git.proxmox.com Git - mirror_ovs.git/log
Ben Pfaff [Wed, 4 Dec 2019 23:06:08 +0000 (15:06 -0800)]
ofp-monitor: Make OFP_FLOW_REMOVED_REASON_BUFSIZE public.

This constant is needed to use ofp_flow_removed_reason_to_string(),
which is itself public.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
Ben Pfaff [Wed, 4 Dec 2019 23:06:07 +0000 (15:06 -0800)]
ofp-print: Abbreviate lists of fields in table features output.

This makes the output both shorter and easier to read.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
Ilya Maximets [Sat, 7 Dec 2019 17:14:00 +0000 (18:14 +0100)]
checkpatch: Check spelling in commit messages.

This seems useful as I'm usually making a lot of typing mistakes.
Also, a few commonly used words were added to the extended dictionary.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: William Tu <u9012063@gmail.com>
Ilya Maximets [Sat, 7 Dec 2019 17:10:24 +0000 (18:10 +0100)]
checkpatch: Skip words containing numbers.

Words like 'br0' are common and usually reference code or database
objects that should not be targets for spell checking.  So, it's
better to skip all words that contain digits instead of only those
that start with numbers.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: William Tu <u9012063@gmail.com>
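
The rule described above can be sketched as follows. This is an illustration only, not the checkpatch code: the function names `should_skip_word` and `starts_with_digit` are hypothetical and chosen for this example.

```python
def should_skip_word(word):
    """New rule (sketch): skip any word that contains a digit anywhere."""
    return any(ch.isdigit() for ch in word)


def starts_with_digit(word):
    """Older, narrower rule shown for comparison: skip only a leading digit."""
    return bool(word) and word[0].isdigit()


# 'br0' and 'eth1' slip through the old rule but are skipped by the new one.
words = ["br0", "eth1", "2nd", "bridge"]
skipped = [w for w in words if should_skip_word(w)]
```

With the older rule only '2nd' would be skipped, so names such as 'br0' would still be sent to the spell checker.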
Ilya Maximets [Sat, 7 Dec 2019 17:02:01 +0000 (18:02 +0100)]
checkpatch: Allow common abbreviations for spell checking.

Abbreviations of Latin expressions like 'i.e.' or 'e.g.' are common
and known by the dictionary.  However, our spell checker is not able
to recognize them because it strips dots out of them.  To avoid this
issue, also pass the non-stripped version of the word to the dictionary
checker.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: William Tu <u9012063@gmail.com>
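
A minimal sketch of the idea, assuming a toy dictionary: `KNOWN_WORDS` and `is_known` are hypothetical stand-ins for the real spelling dictionary and its lookup, and the stripping step is simplified to removing dots.

```python
# Hypothetical stand-in for a real spelling dictionary.
KNOWN_WORDS = {"i.e.", "e.g.", "that", "is"}


def is_known(word):
    """Accept a word if either its stripped or its original form is known."""
    stripped = word.replace(".", "")  # simplified version of the stripping step
    return stripped.lower() in KNOWN_WORDS or word.lower() in KNOWN_WORDS


results = {w: is_known(w) for w in ["i.e.", "e.g.", "that.", "teh"]}
```

Checking the stripped form alone would reject 'i.e.' because 'ie' is not in the dictionary; the second lookup on the original form is what accepts it.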
Jinjun Gao [Sun, 8 Dec 2019 09:28:17 +0000 (09:28 +0000)]
datapath-windows: Do not delete internal port on OID_SWITCH_NIC_DISCONNECT

According to the Microsoft documentation:
https://docs.microsoft.com/en-us/windows-hardware/drivers/network/hyper-v-extensible-switch-port-and-network-adapter-states
the following OID request sequence is valid:
         OID_SWITCH_NIC_CONNECT -> OID_SWITCH_NIC_DISCONNECT
                  ^                           |
                  |                           V
         OID_SWITCH_NIC_CREATE  <- OID_SWITCH_NIC_DELETE

In the above sequence, the Windows extensible switch interface assumes
that OID_SWITCH_PORT_CREATE has been issued and the port has been created
successfully.  If the internal port is deleted in HvDisconnectNic(),
HvCreateNic() will fail when OID_SWITCH_NIC_CREATE is received later
because there is no corresponding port.

Signed-off-by: Jinjun Gao <jinjung@vmware.com>
Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>
Ophir Munk [Sun, 8 Dec 2019 14:29:14 +0000 (14:29 +0000)]
dpif-netdev: Retrieve dpif_class from struct dp_netdev.

When a pmd pointer (struct dp_netdev_pmd_thread *) needs to retrieve
the dpif_class it points at, it can access it as pmd->dp->class.  A
second option is to access it as pmd->dp->dpif->dpif_class.  The first
option is safe since there is one dp netdev with a constant pointer to
the dpif class.  The second option is not safe since the pointer
pmd->dp->dpif may be changed under the hood, for example, by a call to
dpif_open().  One such scenario is when a netdev bridge is running
while flow statistics are dumped with dpctl in parallel:
ovs-appctl dpctl/dump-flows.  This commit uses the first, safe option
instead of the second one.

Fixes: 30115809da2e ("dpif-netdev: Use netdev-offload API for port lookup while offloading")
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
zhaozhanxu [Thu, 5 Dec 2019 06:26:25 +0000 (14:26 +0800)]
Add offload packets statistics

Add argument '--offload-stats' for command ovs-appctl bridge/dump-flows
to display the offloaded packets statistics.

The command output looks as follows:

Original command:

ovs-appctl bridge/dump-flows br0

duration=574s, n_packets=1152, n_bytes=110768, priority=0,actions=NORMAL
table_id=254, duration=574s, n_packets=0, n_bytes=0, priority=2,recirc_id=0,actions=drop
table_id=254, duration=574s, n_packets=0, n_bytes=0, priority=0,reg0=0x1,actions=controller(reason=)
table_id=254, duration=574s, n_packets=0, n_bytes=0, priority=0,reg0=0x2,actions=drop
table_id=254, duration=574s, n_packets=0, n_bytes=0, priority=0,reg0=0x3,actions=drop

New command with argument '--offload-stats':

Note: 'n_offload_packets' is a subset of 'n_packets' and 'n_offload_bytes' is
a subset of 'n_bytes'.

ovs-appctl bridge/dump-flows --offload-stats br0

duration=582s, n_packets=1152, n_bytes=110768, n_offload_packets=1107, n_offload_bytes=107992, priority=0,actions=NORMAL
table_id=254, duration=582s, n_packets=0, n_bytes=0, n_offload_packets=0, n_offload_bytes=0, priority=2,recirc_id=0,actions=drop
table_id=254, duration=582s, n_packets=0, n_bytes=0, n_offload_packets=0, n_offload_bytes=0, priority=0,reg0=0x1,actions=controller(reason=)
table_id=254, duration=582s, n_packets=0, n_bytes=0, n_offload_packets=0, n_offload_bytes=0, priority=0,reg0=0x2,actions=drop
table_id=254, duration=582s, n_packets=0, n_bytes=0, n_offload_packets=0, n_offload_bytes=0, priority=0,reg0=0x3,actions=drop

Signed-off-by: zhaozhanxu <zhaozhanxu@163.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
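
Because the offloaded counters are subsets of the totals, the non-offloaded share can be derived by subtraction. The sketch below parses one of the sample lines above; `parse_counters` is a hypothetical helper written for this illustration and is not part of OVS.

```python
import re


def parse_counters(line):
    """Extract the integer n_* counters from one dump-flows output line."""
    return {key: int(value) for key, value in re.findall(r"\b(n_\w+)=(\d+)", line)}


line = ("duration=582s, n_packets=1152, n_bytes=110768, "
        "n_offload_packets=1107, n_offload_bytes=107992, "
        "priority=0,actions=NORMAL")
counters = parse_counters(line)

# Packets and bytes that were counted but not offloaded.
non_offloaded_packets = counters["n_packets"] - counters["n_offload_packets"]
non_offloaded_bytes = counters["n_bytes"] - counters["n_offload_bytes"]
```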
Malvika Gupta [Thu, 5 Dec 2019 17:04:20 +0000 (11:04 -0600)]
dpif-netdev-perf: Accurate cycle counter update

The accurate timing implementation in this patch gets the wall clock counter via
cntvct_el0 register access. This call is portable to all aarch64 architectures
and has been verified on a 64-bit Arm server.

Suggested-by: Yanqin Wei <yanqin.wei@arm.com>
Reviewed-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Malvika Gupta <malvika.gupta@arm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Ian Stokes [Tue, 3 Dec 2019 14:52:57 +0000 (14:52 +0000)]
dpdk: Update to use DPDK 19.11.

This commit adds support for DPDK v19.11.  It includes the following
changes:

1. travis: Enable compilation and linkage with dpdk 19.11.

2. sparse: Remove dpdk network headers copies.

   https://patchwork.ozlabs.org/patch/1185256/

3. dpdk: Migrate to new PDUMP API.

   https://patchwork.ozlabs.org/patch/1192971/

4. netdev-dpdk: Prefix network structures with rte_.

   https://patchwork.ozlabs.org/patch/1109733/

5. netdev-dpdk: Update by new color definitions.

   https://patchwork.ozlabs.org/patch/1086089/

6. docs: Update docs to reference 19.11.

7. docs: Add note regarding hotplug and igb_uio requirements.

To give credit, all authors of the original commits to 'dpdk-latest' with the
above changes have been added as co-authors for this commit.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Co-authored-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Co-authored-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Co-authored-by: Ophir Munk <ophirmu@mellanox.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
William Tu [Tue, 3 Dec 2019 21:37:56 +0000 (13:37 -0800)]
trivial: Fix erspan coding style.

Fix indentation and whitespace.

Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Ben Pfaff <blp@ovn.org>
William Tu [Tue, 3 Dec 2019 21:35:48 +0000 (13:35 -0800)]
AUTHORS: Add Yi Yang.

Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Damijan Skvarc [Tue, 3 Dec 2019 08:33:49 +0000 (09:33 +0100)]
ovs-vsctl: unit test for checking fail-mode related

Introduce a unit test that checks fail-mode related commands.

Signed-off-by: Damijan Skvarc <damjan.skvarc@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Dmytro Linkin [Tue, 3 Dec 2019 14:11:21 +0000 (16:11 +0200)]
ofproto-dpif-xlate: Prevent duplicating of traffic to a mirror port

Currently the ofproto design disallows duplicating an output packet on
forwarding and mirroring to/from the same OVS port.  The following
scenario reveals a gap in the design:
1. Send ping between regular OVS ports (VFs, for example), then stop it.
2. While the rule still exists, create a mirror for one of the ports.
Prevent duplicating of traffic to a mirror port.

Fixes: 86e2dcddce85 ("dpif-xlate: Snoop multicast packets and send them properly")
Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
Acked-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Darrell Ball [Tue, 3 Dec 2019 17:14:17 +0000 (09:14 -0800)]
conntrack: Support zone limits.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
William Tu [Thu, 21 Nov 2019 19:09:02 +0000 (11:09 -0800)]
ofproto-dpif: Refactor the get capability code.

Make the code simpler by removing the use of
xasprintf and free, and use smap_add_format.

Cc: Ben Pfaff <blp@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Yanqin Wei [Tue, 26 Nov 2019 07:35:23 +0000 (15:35 +0800)]
netdev: use acquire-release semantics for change_seq in netdev

'rxq_enabled' of netdev is written in the vhost thread and read by the pmd
thread once it observes that 'change_seq' is updated.  This patch keeps the
ordering on aarch64 and other CPUs with a weak memory model to ensure that
the update of 'rxq_enabled' is observed before the update of 'change_seq'.

Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
Signed-off-by: Yanqin Wei <Yanqin.Wei@arm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Greg Rose [Mon, 25 Nov 2019 22:20:44 +0000 (14:20 -0800)]
datapath: make generic netlink group const

Upstream commit:
    commit 48e48a70c08a8a68f8697f8b30cb83775bda8001
    Author: stephen hemminger <stephen@networkplumber.org>
    Date:   Wed Jul 16 11:25:52 2014 -0700

    openvswitch: make generic netlink group const

    Generic netlink tables can be const.

    Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

The equivalent tables in meter.c and conntrack.c are constified so
it should be safe to do the same for these and will improve
security as well.

Original patch slightly modified for out of tree module.

Passes check-kmod.
Passes Travis.
https://travis-ci.org/gvrose8192/ovs-experimental/builds/616880002

Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Darrell Ball [Tue, 26 Nov 2019 02:39:34 +0000 (18:39 -0800)]
faq: Correct fragment reassembly release.

Correct fragment reassembly release for the userspace datapath.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Mon, 14 Oct 2019 22:34:21 +0000 (15:34 -0700)]
ofproto-dpif-xlate: Restore table ID on error in xlate_table_action().

Found by inspection.

Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Wed, 9 Oct 2019 17:33:44 +0000 (10:33 -0700)]
debian: Update list of copyright holders.

The list of copyright holders was incomplete and out of date.  This
updates it based on a "grep" for copyright notices, which I reviewed by
hand.

CC: 942056@bugs.debian.org
Reported-by: Chris Lamb <lamby@debian.org>
Reported-at: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=942056
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Thu, 10 Oct 2019 21:29:42 +0000 (14:29 -0700)]
Documentation: Convert multiple manpages to ReST.

Tested-by: Numan Siddique <numans@ovn.org>
Acked-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
David Marchand [Thu, 3 Oct 2019 18:11:24 +0000 (20:11 +0200)]
sparse: Get rid of obsolete rte_flow header.

This header had been copied to cope with issues on the dpdk side.
Now that the problems have been fixed [1], let's drop this file as it is
now out of sync with dpdk.

1: https://git.dpdk.org/dpdk/commit/?id=fbb25a3878cc

Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Linhaifeng [Fri, 29 Nov 2019 06:13:35 +0000 (06:13 +0000)]
ofproto: fix stack-buffer-overflow

Should use flow->actions not &flow->actions.

Here is the ASAN report:
=================================================================
==57189==ERROR: AddressSanitizer: stack-buffer-overflow on address 0xffff428fa0e8 at pc 0xffff7f61a520 bp 0xffff428f9420 sp 0xffff428f9498
READ of size 196 at 0xffff428fa0e8 thread T150 (revalidator22)
    #0 0xffff7f61a51f in __interceptor_memcpy (/lib64/libasan.so.4+0xa251f)
    #1 0xaaaad26a3b2b in ofpbuf_put lib/ofpbuf.c:426
    #2 0xaaaad26a30cb in ofpbuf_clone_data_with_headroom lib/ofpbuf.c:248
    #3 0xaaaad26a2e77 in ofpbuf_clone_with_headroom lib/ofpbuf.c:218
    #4 0xaaaad26a2dc3 in ofpbuf_clone lib/ofpbuf.c:208
    #5 0xaaaad23e3993 in ukey_set_actions ofproto/ofproto-dpif-upcall.c:1640
    #6 0xaaaad23e3f03 in ukey_create__ ofproto/ofproto-dpif-upcall.c:1696
    #7 0xaaaad23e553f in ukey_create_from_dpif_flow ofproto/ofproto-dpif-upcall.c:1806
    #8 0xaaaad23e65fb in ukey_acquire ofproto/ofproto-dpif-upcall.c:1984
    #9 0xaaaad23eb583 in revalidate ofproto/ofproto-dpif-upcall.c:2625
    #10 0xaaaad23dee5f in udpif_revalidator ofproto/ofproto-dpif-upcall.c:1076
    #11 0xaaaad26b84ef in ovsthread_wrapper lib/ovs-thread.c:708
    #12 0xffff7e74a8bb in start_thread (/lib64/libpthread.so.0+0x78bb)
    #13 0xffff7e0665cb in thread_start (/lib64/libc.so.6+0xd55cb)

Address 0xffff428fa0e8 is located in stack of thread T150 (revalidator22) at offset 328 in frame
    #0 0xaaaad23e4cab in ukey_create_from_dpif_flow ofproto/ofproto-dpif-upcall.c:1762

  This frame has 4 object(s):
    [32, 96) 'actions'
    [128, 192) 'buf'
    [224, 328) 'full_flow'
    [384, 2432) 'stub' <== Memory access at offset 328 partially underflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
Thread T150 (revalidator22) created by T0 here:
    #0 0xffff7f5b0f7f in __interceptor_pthread_create (/lib64/libasan.so.4+0x38f7f)
    #1 0xaaaad26b891f in ovs_thread_create lib/ovs-thread.c:792
    #2 0xaaaad23dc62f in udpif_start_threads ofproto/ofproto-dpif-upcall.c:639
    #3 0xaaaad23daf87 in ofproto_set_flow_table ofproto/ofproto-dpif-upcall.c:446
    #4 0xaaaad230ff7f in dpdk_evs_cfg_set vswitchd/bridge.c:1134
    #5 0xaaaad2310097 in bridge_reconfigure vswitchd/bridge.c:1148
    #6 0xaaaad23279d7 in bridge_run vswitchd/bridge.c:3944
    #7 0xaaaad23365a3 in main vswitchd/ovs-vswitchd.c:240
    #8 0xffff7dfb1adf in __libc_start_main (/lib64/libc.so.6+0x20adf)
    #9 0xaaaad230a3d3  (/usr/sbin/ovs-vswitchd-2.7.0-1.1.RC5.001.asan+0x26f3d3)

SUMMARY: AddressSanitizer: stack-buffer-overflow (/lib64/libasan.so.4+0xa251f) in __interceptor_memcpy
Shadow bytes around the buggy address:
  0x200fe851f3c0: 00 00 00 00 f1 f1 f1 f1 f8 f2 f2 f2 00 00 00 00
  0x200fe851f3d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f3e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f3f0: 00 00 00 00 f1 f1 f1 f1 00 00 00 00 00 00 00 00
  0x200fe851f400: f2 f2 f2 f2 f8 f8 f8 f8 f8 f8 f8 f8 f2 f2 f2 f2
=>0x200fe851f410: 00 00 00 00 00 00 00 00 00 00 00 00 00[f2]f2 f2
  0x200fe851f420: f2 f2 f2 f2 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f430: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f440: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f450: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f460: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==57189==ABORTING

Acked-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Linhaifeng <haifeng.lin@huawei.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Ilya Maximets [Fri, 22 Nov 2019 15:09:14 +0000 (16:09 +0100)]
dpif-netdev: Use netdev-offload API for port lookup while offloading.

Currently, while offloading, the userspace datapath tries to look up a netdev
in the local port list of the datapath interface instance.  However,
there is no guarantee that these netdevs are the same netdevs that the
netdev-offload module operates with and, as a result, there is no
guarantee that these netdev instances have an initialized flow API.

dpif-netdev should request ports from the netdev-offload module as
intended by the flow offloading API, in the same way as dpif-netlink does.
This will also give us performance benefits because we don't need to
hold global port mutex anymore.

We're not noticing any significant issues with current code, but
it will become a serious issue in the future, e.g. with offloading
for virtual tunneling ports.

Reported-by: Ophir Munk <ophirmu@mellanox.com>
Fixes: 241bad15d99a ("dpif-netdev: associate flow with a mark id")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Eli Britstein <elibr@mellanox.com>
Ilya Maximets [Tue, 26 Nov 2019 19:52:32 +0000 (20:52 +0100)]
ofproto-provider: Move datapath capabilities callback to correct section.

'get_datapath_cap' callback was mistakenly placed in the
'Connection tracking' section of the 'struct dpif_class',
while it belongs to the 'Datapath information' section.

CC: William Tu <u9012063@gmail.com>
Fixes: 27501802d09f ("ofproto-dpif: Expose datapath capability to ovsdb.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
Ilya Maximets [Thu, 21 Nov 2019 13:14:52 +0000 (14:14 +0100)]
dp-packet: Fix clearing/copying of memory layout flags.

'ol_flags' of DPDK mbuf could contain bits responsible for external
or indirect buffers which are not actually offload flags in a common
sense.  Clearing/copying of these flags could lead to memory leaks of
external memory chunks and crashes due to access to wrong memory.

OVS should not clear these flags while resetting offloads and also
should not copy them to the newly allocated packets.

This change is required to support DPDK 19.11, as some drivers may
return mbufs with these flags set.  However, it might be good to do
the same for DPDK 18.11, because these flags are present and should
be taken into account.

Fixes: 03f3f9c0faf8 ("dpdk: Update to use DPDK 18.11.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
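
The masking described above can be sketched as follows. This is an illustration under stated assumptions, not the OVS code: the constant names and bit positions are assumptions modeled on the DPDK external/indirect buffer flags, and `reset_offloads` and `copy_offloads` are hypothetical helpers.

```python
# Assumed bit positions for the memory layout flags (illustrative values).
EXT_ATTACHED = 1 << 61   # mbuf uses an external buffer
IND_ATTACHED = 1 << 62   # mbuf is indirect
NON_OFFLOAD_FLAGS = EXT_ATTACHED | IND_ATTACHED


def reset_offloads(ol_flags):
    """Clear offload bits but keep the memory layout bits untouched."""
    return ol_flags & NON_OFFLOAD_FLAGS


def copy_offloads(src_flags, dst_flags):
    """Copy offload bits from src; keep dst's own memory layout bits."""
    return (dst_flags & NON_OFFLOAD_FLAGS) | (src_flags & ~NON_OFFLOAD_FLAGS)


flags = EXT_ATTACHED | 0x3            # external buffer plus two offload bits
after_reset = reset_offloads(flags)   # the layout bit survives the reset
copied = copy_offloads(IND_ATTACHED | 0x5, EXT_ATTACHED)
```

The design point is that the layout bits describe how the buffer memory is owned, so they belong to the packet they were set on and must neither be cleared nor carried over to a newly allocated packet.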
Ilya Maximets [Tue, 26 Nov 2019 10:43:58 +0000 (11:43 +0100)]
netdev-dpdk: Deprecate ring ports.

'dpdkr', a.k.a. DPDK ring ports, have really poor support in OVS and are
not tested on a regular basis.  These ports are intended to work via
shared memory with another DPDK secondary process, but there are lots
of limitations for using this functionality in practice.  Most of them
are connected with running the secondary DPDK application and memory
layout issues.  More details are available in the DPDK guide:
https://doc.dpdk.org/guides-18.11/prog_guide/multi_proc_support.html#multi-process-limitations

Besides the functional limitations, it's also hard to use this
functionality correctly.  The user must be sure that OVS and the secondary
DPDK application are running on different CPU cores, which is hard because
non-PMD threads could float over available CPU cores.  This or any
other misconfiguration will likely lead to a crash of OVS.

Another problem is that the user must actually build the secondary
application with the same version of DPDK that was used for OVS build.

The above issues are the same as those we have while using DPDK pdump.

Besides that, the current implementation in OVS is not able to free
allocated rings, which could lead to memory exhaustion.

Initially these ports were added for use with IVSHMEM for a fast
zero-copy HOST<-->VM communication.  However, IVSHMEM is not used
anymore.  IVSHMEM support was removed from DPDK in 16.11 release
(instructions for IVSHMEM were removed from the OVS docs almost 3 years
ago by commit 90ca71dd317f ("doc: Remove ivshmem instructions.")) and
the patch for QEMU for using regular files as a device backend is no
longer available.  That makes DPDK ring ports barely useful in a real
virtualization environment.

This patch adds deprecation warnings for run-time port creation
and to the documentation.  The intention is to completely remove this
functionality from OVS in one of the next releases.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Ian Stokes <ian.stokes@intel.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Ian Stokes [Tue, 26 Nov 2019 12:03:04 +0000 (12:03 +0000)]
dpdk: Use DPDK 18.11.5 release.

Modify travis linux build script to use the latest DPDK stable release
18.11.5. Update docs for latest DPDK stable releases.

Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Ilya Maximets [Fri, 1 Nov 2019 21:24:39 +0000 (22:24 +0100)]
ofproto: Fix crash on PACKET_OUT due to recursive locking after upcall.

Handling of OpenFlow PACKET_OUT implies pushing the packet through
the datapath and packet processing inside the datapath could trigger
an upcall.  In case of the system datapath, 'dpif_execute()' sends the packet
to the kernel module and returns.  The upcall, if any, will be triggered
inside the kernel and handled by a separate thread in userspace.
But in case of the userspace datapath, full processing of the packet happens
inside 'dpif_execute()' in the same thread that handled PACKET_OUT.
This causes an issue if the upcall leads to modification of flow rules.
For example, it could happen while processing 'learn' actions.
Since whole handling of PACKET_OUT is protected by 'ofproto_mutex',
OVS will assert on attempt to take it recursively while processing
'learn' actions:

   0 __GI_raise (sig=sig@entry=6)
   1 __GI_abort ()
   2 ovs_abort_valist ()
   3 ovs_abort ()
   4 ovs_mutex_lock_at (where=where@entry=0xad4199 "ofproto/ofproto.c:5391")
                <Trying to acquire ofproto_mutex again>
   5 ofproto_flow_mod_learn ()       at ofproto/ofproto.c:5391
                <Trying to modify flows according to 'learn' action>
   6 xlate_learn_action ()           at ofproto/ofproto-dpif-xlate.c:5378
                <'learn' action found>
   7 do_xlate_actions ()             at ofproto/ofproto-dpif-xlate.c:6893
   8 xlate_recursively ()            at ofproto/ofproto-dpif-xlate.c:4233
   9 xlate_table_action ()           at ofproto/ofproto-dpif-xlate.c:4361
  10 in xlate_ofpact_resubmit ()     at ofproto/ofproto-dpif-xlate.c:4672
  11 do_xlate_actions ()             at ofproto/ofproto-dpif-xlate.c:6773
  12 xlate_actions ()                at ofproto/ofproto-dpif-xlate.c:7570
                 <Translating actions>
  13 upcall_xlate ()                 at ofproto/ofproto-dpif-upcall.c:1197
  14 process_upcall ()               at ofproto/ofproto-dpif-upcall.c:1413
  15 upcall_cb ()                    at ofproto/ofproto-dpif-upcall.c:1315
  16 dp_netdev_upcall (DPIF_UC_MISS) at lib/dpif-netdev.c:6236
                 <Flow cache miss. Making upcall>
  17 handle_packet_upcall ()         at lib/dpif-netdev.c:6591
  18 fast_path_processing ()         at lib/dpif-netdev.c:6709
  19 dp_netdev_input__ ()            at lib/dpif-netdev.c:6797
  20 dp_netdev_recirculate ()        at lib/dpif-netdev.c:6842
  21 dp_execute_cb ()                at lib/dpif-netdev.c:7158
  22 odp_execute_actions ()          at lib/odp-execute.c:794
  23 dp_netdev_execute_actions ()    at lib/dpif-netdev.c:7332
  24 dpif_netdev_execute ()          at lib/dpif-netdev.c:3725
  25 dpif_netdev_operate ()          at lib/dpif-netdev.c:3756
                 <Packet pushed to userspace datapath for processing>
  26 dpif_operate ()                 at lib/dpif.c:1367
  27 dpif_execute ()                 at lib/dpif.c:1321
  28 packet_execute ()               at ofproto/ofproto-dpif.c:4760
  29 ofproto_packet_out_finish ()    at ofproto/ofproto.c:3594
                 <Taking ofproto_mutex>
  30 handle_packet_out ()            at ofproto/ofproto.c:3635
  31 handle_single_part_openflow (OFPTYPE_PACKET_OUT) at ofproto/ofproto.c:8400
  32 handle_openflow ()                               at ofproto/ofproto.c:8587
  33 ofconn_run ()
  34 connmgr_run ()
  35 ofproto_run ()
  36 bridge_run__ ()
  37 bridge_run ()
  38 main ()

Fix that by splitting 'ofproto_packet_out_finish()' in two parts:
the first one translates side effects and requires holding 'ofproto_mutex',
and the second one only pushes the packet to the datapath.  The second part
is moved out of 'ofproto_mutex' because 'ofproto_mutex' is not required and
actually should not be taken in order to avoid recursive locking.

Reported-by: Anil Kumar Koli <anilkumar.k@altencalsoftlabs.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2019-April/048494.html
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
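
The locking problem and the shape of the fix can be modeled with a small single-threaded sketch. It is not the OVS implementation: every function name below is a hypothetical stand-in, and a non-reentrant `threading.Lock` plays the role of 'ofproto_mutex'.

```python
import threading

ofproto_mutex = threading.Lock()  # non-reentrant stand-in for 'ofproto_mutex'


def flow_mod_learn(log):
    """Models a 'learn' action: it must take the mutex itself."""
    if not ofproto_mutex.acquire(blocking=False):
        # In this single-threaded model a failed acquire means recursion.
        raise RuntimeError("recursive attempt to take ofproto_mutex")
    try:
        log.append("learned flow")
    finally:
        ofproto_mutex.release()


def execute_in_datapath(log):
    """Models the userspace datapath: the upcall runs in the same thread."""
    flow_mod_learn(log)
    log.append("packet executed")


def packet_out_buggy(log):
    with ofproto_mutex:               # mutex held across the whole operation
        log.append("side effects translated")
        execute_in_datapath(log)      # upcall tries to take the mutex again


def packet_out_fixed(log):
    with ofproto_mutex:               # part one: needs the mutex
        log.append("side effects translated")
    execute_in_datapath(log)          # part two: runs without the mutex
```

Calling `packet_out_buggy([])` raises in this model, which mirrors the assertion in the backtrace above, while `packet_out_fixed` completes because the datapath step runs after the mutex is released.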
Ilya Maximets [Mon, 25 Nov 2019 10:07:42 +0000 (11:07 +0100)]
vswitch.xml: Replace non-ASCII characters.

This fixes OSX build on Travis:

ovs-vswitchd.conf.db.5:4061: warning: invalid input character code 128
ovs-vswitchd.conf.db.5:4061: warning: invalid input character code 156

Fixes: aa453e319961 ("ofproto-dpif: Expose datapath ND Extensions capability to ovsdb")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Flavio Leitner [Fri, 22 Nov 2019 18:09:02 +0000 (15:09 -0300)]
ofproto-dpif: Expose datapath ND Extensions capability to ovsdb

Document and expose datapath ND Extensions capability to ovsdb.

Fixes: d0d571493 ("ofproto-dpif: Allow IPv6 ND Extensions only if supported")
Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Tonghao Zhang [Fri, 15 Nov 2019 02:58:59 +0000 (10:58 +0800)]
ofproto-dpif-upcall: Echo HASH attribute back to datapath.

The kernel datapath may send an upcall with hash info;
ovs-vswitchd should get it from the upcall and then send
it back.

The reason is that:
| When using the kernel datapath, the upcall don't
| include skb hash info relatived. That will introduce
| some problem, because the hash of skb is important
| in kernel stack. For example, VXLAN module uses
| it to select UDP src port. The tx queue selection
| may also use the hash in stack.
|
| Hash is computed in different ways. Hash is random
| for a TCP socket, and hash may be computed in hardware,
| or software stack. Recalculation hash is not easy.
|
| There will be one upcall, without information of skb
| hash, to ovs-vswitchd, for the first packet of a TCP
| session. The rest packets will be processed in Open vSwitch
| modules, hash kept. If this tcp session is forward to
| VXLAN module, then the UDP src port of first tcp packet
| is different from rest packets.
|
| TCP packets may come from the host or dockers, to Open vSwitch.
| To fix it, we store the hash info to upcall, and restore hash
| when packets sent back.

Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-October/364062.html
Link: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=bd1903b7c4596ba6f7677d0dfefd05ba5876707d
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Martin Varghese [Fri, 22 Nov 2019 06:07:46 +0000 (11:37 +0530)]
Datapath: Change in openvswitch kernel module to support MPLS label depth of 3 in ingress direction.

Upstream commit:
    commit fbdcdd78da7c95f1b970d371e1b23cbd3aa990f3
    Author: Martin Varghese <martin.varghese@nokia.com>
    Date:   Mon Nov 4 07:27:44 2019 +0530

    Change in Openvswitch to support MPLS label depth of 3 in ingress
    direction

    The openvswitch module supported an MPLS label depth of 1 in the
    ingress direction, though the userspace OVS supports a max depth
    of 3 labels. This change enables the openvswitch module to support
    a max depth of 3 labels in the ingress direction.

    Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
    Acked-by: Pravin B Shelar <pshelar@ovn.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
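
To illustrate what a label depth limit means, the sketch below walks an MPLS label stack and stops at the bottom-of-stack bit or at a maximum depth. It is a hypothetical illustration rather than the kernel module code; `parse_label_stack` and `make_lse` are names invented for this example. Each label stack entry is 32 bits: a 20-bit label, 3 traffic class bits, the bottom-of-stack bit, and an 8-bit TTL.

```python
import struct

MAX_LABEL_DEPTH = 3  # depth named in the commit message above


def make_lse(label, bottom, ttl=64):
    """Build one 4-byte label stack entry (traffic class left at zero)."""
    return struct.pack("!I", (label << 12) | (int(bottom) << 8) | ttl)


def parse_label_stack(data, max_depth=MAX_LABEL_DEPTH):
    """Return up to max_depth labels, stopping at the bottom-of-stack bit."""
    labels = []
    for i in range(max_depth):
        offset = 4 * i
        if len(data) < offset + 4:
            break
        (lse,) = struct.unpack_from("!I", data, offset)
        labels.append(lse >> 12)
        if lse & 0x100:  # bottom-of-stack bit set
            break
    return labels


stack = make_lse(100, False) + make_lse(200, False) + make_lse(300, True)
depth_three = parse_label_stack(stack)
depth_one = parse_label_stack(stack, max_depth=1)
```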
Zhike Wang [Fri, 8 Nov 2019 09:02:44 +0000 (17:02 +0800)]
flow: Fix IPv6 header parser with partial offloading.

Set nw_proto before it is used in parse_ipv6_ext_hdrs__().

Signed-off-by: Zhike Wang <wangzk320@163.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Nitin katiyar [Tue, 12 Nov 2019 08:08:59 +0000 (09:08 +0100)]
lacp: Add support to recognize LACP Marker RX PDUs.

OVS does not support the LACP Marker protocol. Typically, ToR switches
send a LACP Marker PDU when restarting LACP negotiation following a link
flap or LACP timeout.

When a LACP Marker PDU is received, OVS logs the following warning and
drops the packet:
    “lacp(pmdXXX)|WARN|bond-prv: received an unparsable LACP PDU.”

As the above message is logged around the same time the link flap or
LACP down events are logged, it gives a misleading impression that the
reception of an unparsable LACP PDU is the reason for the LACP down
event.

The proposed patch does not add support for the LACP Marker protocol.
It simply recognizes LACP Marker packets, drops them and logs a clear
message indicating that a Marker packet was received. A counter to
track the number of such packets received is also added.
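
A minimal sketch of the recognition step described above, assuming the
IEEE 802.3 Slow Protocols convention that the first payload byte after the
Ethernet header is a subtype (1 for LACP, 2 for Marker). The names
classify_slow_pdu(), enum pdu_kind and marker_rx_count are hypothetical and
not taken from the OVS source; full validation of a LACP PDU is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Slow Protocols subtypes (IEEE 802.3). */
#define SUBTYPE_LACP   1
#define SUBTYPE_MARKER 2

enum pdu_kind { PDU_LACP, PDU_MARKER, PDU_UNPARSABLE };

/* Hypothetical counter of Marker PDUs received. */
static uint64_t marker_rx_count;

/* Classifies a Slow Protocols payload by its subtype byte. */
enum pdu_kind
classify_slow_pdu(const uint8_t *payload, size_t len)
{
    assert(payload != NULL || len == 0);
    if (len < 1) {
        return PDU_UNPARSABLE;
    }
    switch (payload[0]) {
    case SUBTYPE_LACP:
        return PDU_LACP;
    case SUBTYPE_MARKER:
        marker_rx_count++;      /* Counted here, dropped by the caller. */
        return PDU_MARKER;
    default:
        return PDU_UNPARSABLE;
    }
}
```

With this split the caller can log a Marker-specific message instead of
the generic "unparsable" warning.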

Signed-off-by: Nitin katiyar <nitin.katiyar@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agoofproto-dpif: Allow IPv6 ND Extensions only if supported
Flavio Leitner [Wed, 20 Nov 2019 14:21:13 +0000 (11:21 -0300)]
ofproto-dpif: Allow IPv6 ND Extensions only if supported

The IPv6 ND Extensions are only implemented in the userspace datapath,
but nothing prevents them from being used with other datapaths.

This patch probes the datapath and only allows them if support
is available.

Fixes: 9b2b84973 ("Support for match & set ICMPv6 reserved and options type fields")
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agoAUTHORS: Add Wang Li.
Ben Pfaff [Fri, 22 Nov 2019 00:48:45 +0000 (16:48 -0800)]
AUTHORS: Add Wang Li.

Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agoipf: bail out when ipf state is COMPLETED
Li RongQing [Thu, 14 Nov 2019 09:18:18 +0000 (17:18 +0800)]
ipf: bail out when ipf state is COMPLETED

It is easy to crash OVS when a packet with the same id
hits a list that has already been completely reassembled
but has not been sent out yet, and the packet is not
treated as a duplicate within that ipf list because it
has a bigger offset:

    1  0x00007f9fef0ae2d9 in __GI_abort () at abort.c:89
    2  0x0000000000464042 in ipf_list_state_transition at lib/ipf.c:545

Fixes: 4ea96698f667 ("Userspace datapath: Add fragmentation handling.")
Co-authored-by: Wang Li <wangli39@baidu.com>
Signed-off-by: Wang Li <wangli39@baidu.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agoovsdb raft: Fix election timer parsing in snapshot RPC.
Han Zhou [Wed, 13 Nov 2019 17:33:59 +0000 (09:33 -0800)]
ovsdb raft: Fix election timer parsing in snapshot RPC.

Commit a76ba825 took care of saving and restoring the election timer in
the file header snapshot, but it didn't handle parsing of the election
timer in the install_snapshot_request/reply RPC, which results in
problems: e.g. when the election timer change log is compacted in a
snapshot and then a new node joins the cluster, the new node will use
the default timer instead of the new value.  This patch fixes it by
parsing the election timer in the snapshot RPC.

At the same time the patch updates the test case to cover the DB compact
and join scenario. The test reveals another 2 problems related to
clustered DB compaction, as commented in the test case's XXX, which need
to be addressed separately.

Signed-off-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agojsonrpc: increase input buffer size from 512 to 4096
Lorenzo Bianconi [Wed, 6 Nov 2019 09:19:44 +0000 (11:19 +0200)]
jsonrpc: increase input buffer size from 512 to 4096

Increase the jsonrpc input buffer size from 512 to 4096 bytes in order to
reduce the syscall overhead when downloading a huge database.
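
As a rough illustration of the saving, the sketch below counts how many
reads a transfer needs for a given buffer size. It assumes every read fills
the whole buffer, which real sockets do not guarantee, and reads_needed()
is a hypothetical helper rather than part of jsonrpc.

```c
#include <assert.h>
#include <stddef.h>

/* Number of buffer-sized reads needed for 'total_bytes', rounding up. */
size_t
reads_needed(size_t total_bytes, size_t buf_size)
{
    assert(buf_size > 0);
    return total_bytes / buf_size + (total_bytes % buf_size != 0);
}
```

Under that assumption a 1 MiB download takes 2048 reads with a 512-byte
buffer and 256 reads with a 4096-byte buffer.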

Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agonetdev-afxdp: Enable libbpf logging to OVS.
William Tu [Wed, 20 Nov 2019 20:25:56 +0000 (12:25 -0800)]
netdev-afxdp: Enable libbpf logging to OVS.

libbpf has pr_warn, pr_info, and pr_debug. The patch registers
these print functions, integrating the libbpf logs to OVS log.

Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
4 years agoofproto-dpif: Expose datapath capability to ovsdb.
William Tu [Fri, 4 Oct 2019 20:48:58 +0000 (13:48 -0700)]
ofproto-dpif: Expose datapath capability to ovsdb.

The patch adds support for fetching the datapath's capabilities
from the result of 'check_support()' and writes the supported
capabilities to a new database column, called 'capabilities', under
the Datapath table.

To see how it works, run:
 # ovs-vsctl -- add-br br0 -- set Bridge br0 datapath_type=netdev
 # ovs-vsctl -- --id=@m create Datapath datapath_version=0 \
     'ct_zones={}' 'capabilities={}' \
     -- set Open_vSwitch . datapaths:"netdev"=@m

 # ovs-vsctl list-dp-cap netdev
 ufid=true sample_nesting=true clone=true tnl_push_pop=true \
 ct_orig_tuple=true ct_eventmask=true ct_state=true \
 ct_clear=true max_vlan_headers=1 recirc=true ct_label=true \
 max_hash_alg=1 ct_state_nat=true ct_timeout=true \
 ct_mark=true ct_orig_tuple6=true check_pkt_len=true \
 masked_set_action=true max_mpls_depth=3 trunc=true ct_zone=true

Signed-off-by: William Tu <u9012063@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
---
v5:
    Add improved documentation from Ben and
    fix checkpatch error (tab and line 79 char)
v4:
    rebase to master
v3:
    fix 32-bit build, reported by Greg
    travis: https://travis-ci.org/williamtu/ovs-travis/builds/599276267
v2:
    rebase to master

4 years agoovsdb-server: fix memory leak while deleting zone
Damijan Skvarc [Tue, 12 Nov 2019 11:32:35 +0000 (12:32 +0100)]
ovsdb-server: fix memory leak while deleting zone

A memory leak was detected by valgrind during execution
of the "database commands -- positive checks" test.

The leaked memory was allocated in the ovsdb_execute_mutate() function
while parsing mutations from the corresponding json entity:

==19563==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19563==    by 0x4652D0: xmalloc (util.c:138)
==19563==    by 0x46539E: xmemdup0 (util.c:168)
==19563==    by 0x4653F7: xstrdup (util.c:177)
==19563==    by 0x450379: ovsdb_base_type_clone (ovsdb-types.c:208)
==19563==    by 0x450F8D: ovsdb_type_clone (ovsdb-types.c:550)
==19563==    by 0x428C3F: ovsdb_mutation_from_json (mutation.c:108)
==19563==    by 0x428F6B: ovsdb_mutation_set_from_json (mutation.c:187)
==19563==    by 0x42578D: ovsdb_execute_mutate (execution.c:573)
==19563==    by 0x4246B0: ovsdb_execute_compose (execution.c:171)
==19563==    by 0x41CDE5: ovsdb_trigger_try (trigger.c:204)
==19563==    by 0x41C8DF: ovsdb_trigger_init (trigger.c:61)
==19563==    by 0x40E93C: ovsdb_jsonrpc_trigger_create (jsonrpc-server.c:1135)
==19563==    by 0x40E20C: ovsdb_jsonrpc_session_got_request (jsonrpc-server.c:1002)
==19563==    by 0x40D1C2: ovsdb_jsonrpc_session_run (jsonrpc-server.c:561)
==19563==    by 0x40D31E: ovsdb_jsonrpc_session_run_all (jsonrpc-server.c:591)
==19563==    by 0x40CD6E: ovsdb_jsonrpc_server_run (jsonrpc-server.c:406)
==19563==    by 0x40627E: main_loop (ovsdb-server.c:209)
==19563==    by 0x406E66: main (ovsdb-server.c:460)

This memory is usually freed at the end of ovsdb_execute_mutate();
however, in the aforementioned test case this does not happen. Namely,
in case of a delete mutator and an error from ovsdb_datum_from_json(),
the mutation in question was marked as invalid, which prevents freeing
the affected memory.

Memory leak can be reproduced quickly with the following command sequence:
ovs-vsctl --no-wait -vreconnect:emer add-zone-tp netdev zone=1 icmp_first=1 icmp_reply=2
ovs-vsctl --no-wait -vreconnect:emer del-zone-tp netdev zone=1

Signed-off-by: Damijan Skvarc <damjan.skvarc@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agocompat: Add missing inline keyword
Greg Rose [Tue, 5 Nov 2019 22:14:24 +0000 (14:14 -0800)]
compat: Add missing inline keyword

The missing inline keyword before the definition of the
rpl_nf_ct_tmpl_free() function causes spurious warnings about the
function not being used on some older kernels.  Add the keyword
to suppress the warning.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agoovs-actions: Clarify documentation for stack usage with group buckets.
Ben Pfaff [Wed, 20 Nov 2019 19:53:46 +0000 (11:53 -0800)]
ovs-actions: Clarify documentation for stack usage with group buckets.

This should be less confusing now.

Reported-by: Han Zhou <hzhou@ovn.org>
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agonetdev-afxdp: Best-effort configuration of XDP mode.
Ilya Maximets [Wed, 6 Nov 2019 21:38:33 +0000 (21:38 +0000)]
netdev-afxdp: Best-effort configuration of XDP mode.

Until now there were only two options for XDP mode in OVS: SKB or DRV,
i.e. 'generic XDP' or 'native XDP with zero-copy enabled'.

Devices like 'veth' interfaces in Linux support native XDP, but
don't support zero-copy mode.  This case cannot be covered by the
existing API and we have to use slower generic XDP for such devices.
There are a few more issues, e.g. TCP is not supported in generic XDP
mode for veth interfaces due to kernel limitations, however it is
supported in native mode.

This change introduces the ability to use native XDP without zero-copy
along with a best-effort configuration option that is enabled by default.
In the best-effort case OVS will sequentially try different modes starting
from the fastest one and will choose the first one acceptable for the
current interface.  This selects the fastest mode the interface supports.
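
The selection order can be sketched as below. This is only an illustration
under stated assumptions: enum xdp_mode, xdp_best_effort() and the probe
callback are hypothetical names, and the real code has to attempt an actual
socket/program setup for each mode instead of calling a simple predicate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Modes ordered from fastest to slowest. */
enum xdp_mode {
    XDP_MODE_NATIVE_ZEROCOPY,
    XDP_MODE_NATIVE,
    XDP_MODE_GENERIC,
    XDP_N_MODES
};

/* Hypothetical probe: returns true if 'mode' can be configured. */
typedef bool (*try_mode_fn)(enum xdp_mode mode, void *aux);

/* Returns the first mode 'try_mode' accepts, or XDP_N_MODES if none. */
enum xdp_mode
xdp_best_effort(try_mode_fn try_mode, void *aux)
{
    assert(try_mode != NULL);
    for (int m = 0; m < XDP_N_MODES; m++) {
        if (try_mode((enum xdp_mode) m, aux)) {
            return (enum xdp_mode) m;
        }
    }
    return XDP_N_MODES;
}

/* Example probe that behaves like a veth: native XDP, no zero-copy. */
bool
veth_like_probe(enum xdp_mode mode, void *aux)
{
    (void) aux;
    return mode != XDP_MODE_NATIVE_ZEROCOPY;
}
```

For the veth-like probe the loop rejects zero-copy and settles on native
mode, which matches the behaviour described for veth interfaces.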

If the user wants to choose a specific mode, it's still possible by
setting 'options:xdp-mode'.

This change additionally changes the API by renaming the configuration
knob from 'xdpmode' to 'xdp-mode' and also renaming the modes
themselves to be more user-friendly.

The full list of currently supported modes:
  * native-with-zerocopy - former DRV
  * native               - new one, DRV without zero-copy
  * generic              - former SKB
  * best-effort          - new one, chooses the best available from
                           3 above modes

Since 'best-effort' is the default mode, users will not need to
explicitly set 'xdp-mode' in most cases.

TCP related tests are re-enabled in the system afxdp testsuite, because
'best-effort' will choose 'native' mode for veth interfaces
and this mode has no issues with TCP.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
4 years agodpdk: Deprecate pdump support.
Ilya Maximets [Mon, 11 Nov 2019 18:52:56 +0000 (19:52 +0100)]
dpdk: Deprecate pdump support.

The conventional way for packet dumping in OVS is to use ovs-tcpdump
that works via traffic mirroring.  DPDK pdump could probably be used
for some lower level debugging, but it is not commonly used for
various reasons.

There are lots of limitations for using this functionality in practice.
Most of them are connected with running a secondary pdump process and
memory layout issues like the requirement to disable ASLR in the kernel.
More details are available in DPDK guide:
https://doc.dpdk.org/guides/prog_guide/multi_proc_support.html#multi-process-limitations

Besides the functional limitations it's also hard to use this
functionality correctly.  The user must be sure that OVS and the pdump
utility are running on different CPU cores, which is hard because
non-PMD threads could float over available CPU cores.  This or any
other misconfiguration will likely lead to a crash of the pdump utility
and/or OVS.

Another problem is that the user must actually have this special pdump
utility in the system and it might not be available in distributions.

This change disables pdump support by default, introducing a special
configuration option '--enable-dpdk-pdump'.  Deprecation warnings will
be shown to users at configuration time and at runtime.

The plan is to completely remove this functionality from OVS in one
of the next releases.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
4 years agonetdev-afxdp: add afxdp specific maximum MTU check
Eelco Chaudron [Tue, 12 Nov 2019 09:46:09 +0000 (04:46 -0500)]
netdev-afxdp: add afxdp specific maximum MTU check

Drivers natively supporting AF_XDP will check that a configured MTU size
will not exceed the allowed size for AF_XDP. However, when the skb
compatibility mode is used there is no check and any value is accepted.
This, for example, is the case when using the TAP interface.

This fix adds a check to make sure only AF_XDP valid values are accepted.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: William Tu <u9012063@gmail.com>
4 years agodpif-netdev: Log rxq assignment for isolated pmd.
Gowrishankar Muthukrishnan [Sat, 9 Nov 2019 03:11:27 +0000 (08:41 +0530)]
dpif-netdev: Log rxq assignment for isolated pmd.

There is no log about isolated rxq assignment in a pmd today, which
could sometimes be useful for tracing rxq/pmd pinning when debugging
with logs. 'ovs-appctl dpif-netdev/pmd-rxq-show' already reports it,
but logging is helpful to trace pinning over time.

Signed-off-by: Gowrishankar Muthukrishnan <gmuthukr@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
4 years agorhel: Fix ovs-kmod-manage.sh that may create invalid soft links
Yifeng Sun [Mon, 18 Nov 2019 20:06:26 +0000 (12:06 -0800)]
rhel: Fix ovs-kmod-manage.sh that may create invalid soft links

Current code iterates every kernel under '/lib/modules' for a matched
version. As a result, this script may create invalid soft links if the
matched kernel doesn't have openvswitch-kmod RPM installed.

This patch fixes it.

VMWare-BZ: #2257534

Fixes: c3570519 ("rhel: add 4.4 kernel in kmod build with mulitple versions, fedora")
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
4 years agoovs-bugtool: Script to collect the port statistics.
Sriram Vatala [Tue, 29 Oct 2019 14:50:06 +0000 (20:20 +0530)]
ovs-bugtool: Script to collect the port statistics.

Sometimes, analysing the drop statistics of the ports
is helpful in debugging. This patch adds a script
to collect all supported port stats, including
the drop counters in the userspace datapath. The output of
this script is included in the bugtool output.

Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Sriram Vatala <sriram.v@altencalsoftlabs.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
4 years agonetdev-dpdk: Detailed packet drop statistics.
Sriram Vatala [Tue, 29 Oct 2019 14:50:05 +0000 (20:20 +0530)]
netdev-dpdk: Detailed packet drop statistics.

OVS may be unable to transmit packets for multiple reasons on
the userspace datapath and today there is a single counter to
track packets dropped due to any of those reasons. This patch
adds custom software stats for the different reasons packets
may be dropped during tx/rx on the userspace datapath in OVS.

- MTU drops  : drops that occur due to a too large packet size
- QoS drops  : drops that occur due to egress/ingress QoS
- Tx failures: drops as returned by the DPDK PMD send function

Note that the reason for tx failures is not specified in OVS.
In practice for vhost ports it is most common that tx failures
are because there are not enough available descriptors,
which is usually caused by misconfiguration of the guest queues
and/or because the guest is not consuming packets fast enough
from the queues.

These counters are displayed along with other stats in
"ovs-vsctl get interface <iface> statistics" command and are
available for dpdk and vhostuser/vhostuserclient ports.

Also the existing "tx_retries" counter for vhost ports has been
renamed to "ovs_tx_retries", so that all the custom statistics
that OVS accumulates itself will have the prefix "ovs_". This
will prevent any custom stats names overlapping with
driver/HW stats.

Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Sriram Vatala <sriram.v@altencalsoftlabs.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
4 years agonetdev-dpdk: Reuse vhost function for dpdk ETH custom stats.
Ilya Maximets [Tue, 29 Oct 2019 14:50:04 +0000 (20:20 +0530)]
netdev-dpdk: Reuse vhost function for dpdk ETH custom stats.

This is yet another refactoring for the upcoming detailed drop stats.
It allows using a single function for all the software calculated
statistics in netdev-dpdk for both vhost and ETH ports.

UINT64_MAX is used as a marker for non-supported statistics in the
same way as it's done in bridge.c for common netdev stats.
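
The marker convention can be sketched as follows. This is an illustration
only: struct sw_stat, sw_stats_init() and sw_stats_collect() are
hypothetical names, not the functions touched by this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sw_stat {
    const char *name;
    uint64_t value;         /* UINT64_MAX means "not supported". */
};

/* Marks every counter as unsupported until a port type fills it in. */
void
sw_stats_init(struct sw_stat *stats, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        stats[i].value = UINT64_MAX;
    }
}

/* Copies only supported counters to 'out'; returns how many were copied. */
size_t
sw_stats_collect(const struct sw_stat *stats, size_t n, struct sw_stat *out)
{
    size_t n_out = 0;

    assert(stats != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (stats[i].value != UINT64_MAX) {
            out[n_out++] = stats[i];
        }
    }
    return n_out;
}
```

One shared collector can then serve vhost and ETH ports: each port type
writes the counters it tracks and leaves the rest at the marker value.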

Co-authored-by: Sriram Vatala <sriram.v@altencalsoftlabs.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Sriram Vatala <sriram.v@altencalsoftlabs.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
4 years agotc: Set 'no_percpu' flag for compatible actions
Vlad Buslov [Mon, 4 Nov 2019 16:34:49 +0000 (18:34 +0200)]
tc: Set 'no_percpu' flag for compatible actions

Recent changes in the Linux kernel TC action subsystem introduced a new
TCA_ACT_FLAGS_NO_PERCPU_STATS flag. The purpose of the flag is to request
the action implementation to skip allocating action stats with the
expensive percpu allocator and use regular built-in action stats instead.
This approach significantly improves rule insertion rate and reduces
memory usage for hardware-offloaded rules that don't need the benefits
provided by percpu allocated stats (improved software TC fast-path
performance). Set the flag for all compatible actions.

Modify acinclude.m4 to use OVS-internal pkt_cls.h implementation when
TCA_ACT_FLAGS is not defined by kernel headers and to manually define
struct nla_bitfield32 in netlink.h (new file) when it is not defined by
kernel headers.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
4 years agocompat: Add compat fix for old kernels
Roi Dayan [Wed, 6 Nov 2019 07:34:46 +0000 (09:34 +0200)]
compat: Add compat fix for old kernels

In kernels older than 4.8, struct tcf_t didn't have the 'firstuse' field.
If openvswitch is compiled with the compat pkt_cls.h then there is
a struct size mismatch between openvswitch and the kernel, which causes
parsing of netlink actions to fail.
After this commit, parsing of the netlink actions passes even if compiled
with the compat pkt_cls.h.
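
The size mismatch can be illustrated with two stand-in structs. The
sketch assumes that a strict length check against the newer layout is what
rejects the older kernel's attribute, and shows a tolerant check as one
plausible remedy; it does not claim to reproduce the actual change in this
commit. struct tcf_t_old, struct tcf_t_new and both helpers are
hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Layout used by kernels older than 4.8: no 'firstuse'. */
struct tcf_t_old {
    uint64_t install;
    uint64_t lastuse;
    uint64_t expires;
};

/* Layout used by newer kernels and by the compat pkt_cls.h. */
struct tcf_t_new {
    uint64_t install;
    uint64_t lastuse;
    uint64_t expires;
    uint64_t firstuse;
};

static_assert(sizeof(struct tcf_t_new)
              == sizeof(struct tcf_t_old) + sizeof(uint64_t),
              "newer layout is exactly one field longer");

/* Strict check: rejects the attribute sent by an older kernel. */
bool
tcf_t_len_ok_strict(size_t len)
{
    return len == sizeof(struct tcf_t_new);
}

/* Tolerant check: accepts any attribute that holds the common prefix. */
bool
tcf_t_len_ok_compat(size_t len)
{
    return len >= sizeof(struct tcf_t_old);
}
```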

Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
4 years agonetdev-dpdk: Track vhost tx contention.
David Marchand [Mon, 26 Aug 2019 14:33:17 +0000 (16:33 +0200)]
netdev-dpdk: Track vhost tx contention.

Add a coverage counter to help diagnose contention on the vhost txqs.
This is seen as dropped packets on the physical ports for rates that
are usually handled fine by OVS.

Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
4 years agotests: Wait up to OVS_CTL_TIMEOUT seconds.
Ilya Maximets [Wed, 6 Nov 2019 16:29:58 +0000 (17:29 +0100)]
tests: Wait up to OVS_CTL_TIMEOUT seconds.

While running tests under valgrind, it could take more than 10 seconds
for a process to disappear after a successful 'ovs-appctl exit' command.

The same applies to some other events that tests are waiting for with
the OVS_WAIT macro.  This makes tests fail frequently under valgrind.

Using OVS_CTL_TIMEOUT variable instead of constant 10 seconds seems
reasonable to avoid this issue because it controls timeouts of all
control utilities and needs to be adjusted while running under valgrind
anyway.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: William Tu <u9012063@gmail.com>
4 years agovswitch.xml: Fix column for xdpmode.
Ilya Maximets [Tue, 5 Nov 2019 20:54:08 +0000 (21:54 +0100)]
vswitch.xml: Fix column for xdpmode.

'xdpmode' is part of 'options', not the 'other_config'.

CC: William Tu <u9012063@gmail.com>
Fixes: 0de1b425962d ("netdev-afxdp: add new netdev type for AF_XDP.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
4 years agoip_gre: Remove even more unused code
Greg Rose [Mon, 4 Nov 2019 20:43:47 +0000 (12:43 -0800)]
ip_gre: Remove even more unused code

There is a confusing mix of ipgre and gretap functions with some
needed for gretap still having ipgre_ prefixes.  This time though
I think I got the rest of the unused ipgre code.

Passes Travis here and this time I made sure the patch passing
Travis is the same one I'm mailing.
https://travis-ci.org/gvrose8192/ovs-experimental/builds/607296133

Fixes: d5822f428814 ("gre: Remove dead ipgre code")
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agoAUTHORS: Add Tomasz Konieczny.
Ilya Maximets [Mon, 4 Nov 2019 19:27:01 +0000 (20:27 +0100)]
AUTHORS: Add Tomasz Konieczny.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
4 years agonetdev-dpdk: Fix flow control not configuring.
Tomasz Konieczny [Thu, 12 Sep 2019 10:43:20 +0000 (12:43 +0200)]
netdev-dpdk: Fix flow control not configuring.

Currently OVS is unable to change the flow control configuration in DPDK
because new settings are overwritten by the current settings read with
rte_eth_dev_flow_ctrl_get(). The fix restores the correct order of
operations and at the same time does not trigger an error on devices
without flow control support when flow control is not requested.

Fixes: 7e1de65e8dfb ("netdev-dpdk: Fix failure to configure flow control at netdev-init.")
Signed-off-by: Tomasz Konieczny <tomaszx.konieczny@intel.com>
Co-authored-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
4 years agofaq: Fix meter action releases.
Darrell Ball [Sat, 2 Nov 2019 19:05:34 +0000 (12:05 -0700)]
faq: Fix meter action releases.

At the same time disambiguate some feature descriptions.
'Meters' is changed to 'Meter action' to clarify that the entry
describes the Openflow meter action rather than port based meters.
'NAT' is changed to 'Conntrack NAT' to indicate that this entry
represents NAT done in 'conntrack', rather than basic Openflow
IP address and L4 port modifications.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agolib/tc: Fix flow dump for tunnel id equal zero
Dmytro Linkin [Wed, 30 Oct 2019 12:40:35 +0000 (14:40 +0200)]
lib/tc: Fix flow dump for tunnel id equal zero

Tunnel id 0 is not printed unless tunnel flag FLOW_TNL_F_KEY is set.
Fix that by always setting FLOW_TNL_F_KEY when tunnel id is valid.

Fixes: 0227bf092ee6 ("lib/tc: Support optional tunnel id")
Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
4 years agotravis: Fix skipping of python SSL tests.
Ilya Maximets [Fri, 1 Nov 2019 22:16:28 +0000 (23:16 +0100)]
travis: Fix skipping of python SSL tests.

After this change we'll have only one windows related skipped test
in default build.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
4 years agotravis: Workaround skipping of IPv6 tests.
Ilya Maximets [Fri, 1 Nov 2019 21:53:03 +0000 (22:53 +0100)]
travis: Workaround skipping of IPv6 tests.

IPv6 support is disabled in TravisCI images but supported by the kernel.
So, we can enable it in order to not skip unit tests.
We are not trying to communicate over the network with IPv6, so this
should not do any harm.

Related issue: https://github.com/travis-ci/travis-ci/issues/8891

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
4 years agoofp-monitor: Fixed the usage of 'usable_protocols' variable in 'parse_flow_monitor_re...
Ashish Varma [Tue, 23 Jul 2019 20:02:10 +0000 (13:02 -0700)]
ofp-monitor: Fixed the usage of 'usable_protocols' variable in 'parse_flow_monitor_request' function.

'usable_protocols' is now set to OFPUTIL_P_OF10_ANY on return from the
'parse_flow_monitor_request' function. The calling function now checks
the value of this variable against the 'allowed_protocols' variable.
Also, a check is added for match fields that are not supported in
OpenFlow 1.0, and an error is returned for them.
Modified the man page of ovs-ofctl to reflect Flow Monitor support as
an OpenFlow 1.0 Nicira extension only.

Signed-off-by: Ashish Varma <ashishvarma.ovs@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agoRevert "ip_gre: Remove even more unused code"
Greg Rose [Fri, 1 Nov 2019 16:07:54 +0000 (09:07 -0700)]
Revert "ip_gre: Remove even more unused code"

This reverts commit 42a059e02bf343787951be2824c579e1c9a26e12.

Not all the necessary ipgre prefixed code was removed that
should have been.  Another patch will follow with the correct
removed code.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
4 years agotravis: Enable pdump for DPDK build.
Ilya Maximets [Wed, 30 Oct 2019 18:39:46 +0000 (19:39 +0100)]
travis: Enable pdump for DPDK build.

OVS has support for DPDK pdump that is checked in the configure script.
Enable it to increase OVS build test coverage with the code guarded
by the DPDK_PDUMP macro.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: David Marchand <david.marchand@redhat.com>
4 years agoip_gre: Remove even more unused code
Greg Rose [Thu, 31 Oct 2019 22:46:04 +0000 (15:46 -0700)]
ip_gre: Remove even more unused code

There is a confusing mix of ipgre and gretap functions with some
needed for gretap still having ipgre_ prefixes.  This time though
I think I got the rest of the unused ipgre code.

Fixes: d5822f428814 ("gre: Remove dead ipgre code")
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agoip_gre: Removed unused ipgre netdev ops
Greg Rose [Thu, 31 Oct 2019 20:30:39 +0000 (13:30 -0700)]
ip_gre: Removed unused ipgre netdev ops

When cleaning up unused ipgre code the ipgre_netdev_ops structure
was missed. Get rid of it now.

Fixes: d5822f428814 ("gre: Remove dead ipgre code")
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agoAvoid indeterminate statistics in offload implementations.
Ben Pfaff [Fri, 25 Oct 2019 18:46:24 +0000 (11:46 -0700)]
Avoid indeterminate statistics in offload implementations.

A lot of the offload implementations didn't bother to initialize the
statistics they were supposed to return.  I don't know whether any of
the callers actually use them, but it looked wrong.

Found by inspection.

Acked-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agoAUTHORS: Add Gowrishankar Muthukrishnan.
Ben Pfaff [Wed, 30 Oct 2019 17:51:44 +0000 (10:51 -0700)]
AUTHORS: Add Gowrishankar Muthukrishnan.

Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agolacp: warn transmit failure of lacp pdu
Gowrishankar Muthukrishnan [Mon, 21 Oct 2019 14:04:36 +0000 (19:34 +0530)]
lacp: warn transmit failure of lacp pdu

It might be difficult to trace whether a LACP PDU tx (as in a
response) was successful when the PDU was not transmitted by the
egress slave for various reasons (including resource contention
within the NIC); the only way to trace its fate is by looking at
ofproto->stats.tx_[packets/bytes] and slave port stats.

Adding a warning when there is a tx failure could help the user
debug the root of this problem.

Signed-off-by: Gowrishankar Muthukrishnan <gmuthukr@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agoovsdb-execute: Remove unused variable from ovsdb_execute_mutate().
Damijan Skvarc [Tue, 29 Oct 2019 13:04:58 +0000 (14:04 +0100)]
ovsdb-execute: Remove unused variable from ovsdb_execute_mutate().

Signed-off-by: Damijan Skvarc <damjan.skvarc@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agonetdev-afxdp: Add need_wakeup support.
William Tu [Wed, 23 Oct 2019 21:06:01 +0000 (14:06 -0700)]
netdev-afxdp: Add need_wakeup support.

The patch adds support for using the need_wakeup flag in AF_XDP rings.
A new option, use-need-wakeup, is added.  When this option is used,
it means that OVS has to explicitly wake up the kernel RX, using the poll()
syscall, and wake up TX, using the sendto() syscall. This feature improves
performance by avoiding unnecessary sendto syscalls for TX.
For RX, instead of the kernel always busy-spinning on the fill queue, OVS
wakes up the kernel RX processing when the fill queue is replenished.

The need_wakeup feature was merged into the Linux kernel bpf-next tree with
commit 77cd0d7b3f25 ("xsk: add support for need_wakeup flag in AF_XDP rings")
and OVS enables it by default, if libbpf supports it.  If users enable it but
run with an older version of libbpf, then the need_wakeup feature has no
effect, and a warning message is logged.

For virtual interfaces, it's better to set use-need-wakeup=false, since
the virtual device's AF_XDP xmit is synchronous: the sendto syscall
enters the kernel and processes the TX packet on the tx queue directly.

On Intel Xeon E5-2620 v3 2.4GHz system, performance of physical port
to physical port improves from 6.1Mpps to 7.3Mpps.

Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
4 years agorhel: openvswitch-fedora.spec.in: Fix output redirect to null device
Roi Dayan [Mon, 28 Oct 2019 08:37:44 +0000 (10:37 +0200)]
rhel: openvswitch-fedora.spec.in: Fix output redirect to null device

Add missing slash.

Fixes: 0447019df7c6 ("fedora-spec: added systemd post/postun/pre/preun sections")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
4 years agodpif-netdev: Fix time delta overflow in case of race for meter lock.
Ilya Maximets [Thu, 24 Oct 2019 13:15:07 +0000 (15:15 +0200)]
dpif-netdev: Fix time delta overflow in case of race for meter lock.

There is a race window between getting the time and getting the meter
lock.  This could lead to a situation where the thread with the larger
current time (this thread called time_{um}sec() later than the others)
acquires the meter lock first and updates meter->used to the large
value.  The next threads will try to calculate the time delta by
subtracting the large meter->used from their lower time, getting a
negative value which will be converted to a big unsigned delta.

Fix that by assuming that all these threads received packets at the
same time in this case, i.e. clamping a negative delta to 0.
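
The overflow and the clamp can be shown in isolation. This is a sketch
with hypothetical helper names (meter_delta_buggy(), meter_delta_fixed()),
assuming signed 64-bit timestamps; it is not the code from dpif-netdev.

```c
#include <assert.h>
#include <stdint.h>

/* Converts a possibly negative difference straight to unsigned. */
uint64_t
meter_delta_buggy(long long now, long long used)
{
    return (uint64_t) (now - used);
}

/* Treats "used is in the future" as "no time has passed". */
uint64_t
meter_delta_fixed(long long now, long long used)
{
    assert(now >= 0 && used >= 0);
    return now > used ? (uint64_t) (now - used) : 0;
}
```

With now = 100 and used = 105 the first helper yields a value close to
2^64, while the second yields 0.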

CC: Jarno Rajahalme <jarno@ovn.org>
Fixes: 4b27db644a8c ("dpif-netdev: Simple DROP meter implementation.")
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-September/363126.html
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
4 years agodpif-netdev: Do not mix recirculation depth into RSS hash itself.
Ilya Maximets [Thu, 24 Oct 2019 10:55:15 +0000 (12:55 +0200)]
dpif-netdev: Do not mix recirculation depth into RSS hash itself.

Mixing the RSS hash with the recirculation depth is useful for flow
lookup because the same packet after recirculation should match a
different datapath rule.  Storing the mixed value back into the packet
is completely unnecessary because the recirculation depth is different
on each recirculation, i.e. we will have a different packet hash for
flow lookup anyway.

This should fix the issue that packets from the same flow could be
directed to different buckets based on a dp_hash or different ports of
a balanced bonding in case they were recirculated different number of
times (e.g. due to conntrack rules).
With this change, the original RSS hash will remain the same making
it possible to calculate equal dp_hash values for such packets.
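
A small sketch of the idea: derive the lookup hash from the stored RSS
hash and the depth without writing it back. struct pkt, mix_depth() and
lookup_hash() are hypothetical, and the mixing function is an arbitrary
stand-in rather than the hash OVS uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct pkt {
    uint32_t rss_hash;      /* Original hash; never modified below. */
};

/* Stand-in mixer; each step is invertible, so distinct depths give
 * distinct results for the same input hash. */
static uint32_t
mix_depth(uint32_t hash, uint32_t depth)
{
    uint32_t x = hash ^ (depth * 0x9e3779b1u);

    x ^= x >> 16;
    x *= 0x85ebca6bu;
    x ^= x >> 13;
    return x;
}

/* Hash used for flow lookup only; 'pkt' is const, so the stored RSS
 * hash cannot be overwritten here. */
uint32_t
lookup_hash(const struct pkt *pkt, uint32_t depth)
{
    assert(pkt != NULL);
    return depth ? mix_depth(pkt->rss_hash, depth) : pkt->rss_hash;
}
```

Because the packet keeps its original hash, anything later derived from
it does not depend on how many times the packet was recirculated.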

Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-September/363127.html
Fixes: 048963aa8507 ("dpif-netdev: Reset RSS hash when recirculating.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Jan Scheurich <jan.scheurich@ericsson.com>
4 years agocommand-line: New function ovs_cmdl_env_parse_all().
Aliasgar Ginwala [Fri, 25 Oct 2019 19:33:52 +0000 (12:33 -0700)]
command-line: New function ovs_cmdl_env_parse_all().

This function allows an environment variable to be included in
command-line parsing.  It will receive its first user in an
upcoming commit.

Signed-off-by: Aliasgar Ginwala <aginwala@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agoovsdb-server: fix memory leak while converting database
Damijan Skvarc [Fri, 25 Oct 2019 12:22:58 +0000 (14:22 +0200)]
ovsdb-server: fix memory leak while converting database

A memory leak happens while converting an existing database into a new
database according to the specified schema (ovsdb-client convert
new-schema). The leak was detected by valgrind while executing the
functional test "schema conversion online - clustered":

==16202== 96 bytes in 6 blocks are definitely lost in loss record 326 of 399
==16202==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16202==    by 0x44A5D4: xmalloc (util.c:138)
==16202==    by 0x4377A6: alloc_default_atoms (ovsdb-data.c:315)
==16202==    by 0x437F18: ovsdb_datum_init_default (ovsdb-data.c:918)
==16202==    by 0x413D82: ovsdb_row_create (row.c:59)
==16202==    by 0x40AA53: ovsdb_convert_table (file.c:220)
==16202==    by 0x40AA53: ovsdb_convert (file.c:275)
==16202==    by 0x416BE1: ovsdb_trigger_try (trigger.c:255)
==16202==    by 0x40D29E: ovsdb_jsonrpc_trigger_create (jsonrpc-server.c:1119)
==16202==    by 0x40D29E: ovsdb_jsonrpc_session_got_request (jsonrpc-server.c:986)
==16202==    by 0x40D29E: ovsdb_jsonrpc_session_run (jsonrpc-server.c:556)
==16202==    by 0x40D29E: ovsdb_jsonrpc_session_run_all (jsonrpc-server.c:586)
==16202==    by 0x40D29E: ovsdb_jsonrpc_server_run (jsonrpc-server.c:401)
==16202==    by 0x40682E: main_loop (ovsdb-server.c:209)
==16202==    by 0x40682E: main (ovsdb-server.c:460)

The problem was in the ovsdb_datum_convert() function, which overwrites
pointers to datum memory allocated in the ovsdb_row_create() function.
The fix is to free this memory before ovsdb_datum_convert() is
called.

Signed-off-by: Damijan Skvarc <damjan.skvarc@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agodocs: To build OVS on RHEL7 EPEL is needed
Timothy Redaelli [Fri, 25 Oct 2019 15:41:22 +0000 (17:41 +0200)]
docs: To build OVS on RHEL7 EPEL is needed

Since Python 3 is now mandatory, Extra Packages for Enterprise Linux
(EPEL) repository is needed in order to build OVS on RHEL7.

Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agoflow: Fix crash on vlan packets with partial offloading.
Ilya Maximets [Wed, 23 Oct 2019 20:26:52 +0000 (22:26 +0200)]
flow: Fix crash on vlan packets with partial offloading.

parse_tcp_flags() does not care about vlan tags in a packet and thus
is not able to parse them.  As a result, if partial offloading is
enabled in the userspace datapath, vlan packets are not parsed, i.e.
they have no initialized offsets.  This causes an OVS crash on any
attempt to access/modify packet header fields.

For example, a flow with the following actions:
  in_port=1,ip,actions=mod_nw_src:192.168.0.7,output:IN_PORT

will lead to an OVS crash on vlan packet handling:

 Process terminating with default action of signal 11 (SIGSEGV)
 Invalid read of size 4
    at 0x785657: get_16aligned_be32 (unaligned.h:249)
    by 0x785657: odp_set_ipv4 (odp-execute.c:82)
    by 0x785657: odp_execute_masked_set_action (odp-execute.c:527)
    by 0x785657: odp_execute_actions (odp-execute.c:894)
    by 0x74CDA9: dp_netdev_execute_actions (dpif-netdev.c:7355)
    by 0x74CDA9: packet_batch_per_flow_execute (dpif-netdev.c:6339)
    by 0x74CDA9: dp_netdev_input__ (dpif-netdev.c:6845)
    by 0x74DB6E: dp_netdev_input (dpif-netdev.c:6854)
    by 0x74DB6E: dp_netdev_process_rxq_port (dpif-netdev.c:4287)
    by 0x74E863: dpif_netdev_run (dpif-netdev.c:5264)
    by 0x703F57: type_run (ofproto-dpif.c:370)
    by 0x6EC8B8: ofproto_type_run (ofproto.c:1760)
    by 0x6DA52B: bridge_run__ (bridge.c:3188)
    by 0x6E083F: bridge_run (bridge.c:3252)
    by 0x1642E4: main (ovs-vswitchd.c:127)
  Address 0xc is not stack'd, malloc'd or (recently) free'd

Fix that by properly parsing vlan tags first.  Function 'parse_dl_type'
was transformed for that purpose as it had no users anyway.

Added a unit test for packet modification with partial offloading that
triggers the above crash.

Fixes: aab96ec4d81e ("dpif-netdev: retrieve flow directly from the flow mark")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
4 years agotests: Fix indentation in userspace packet type aware test.
Ilya Maximets [Thu, 24 Oct 2019 12:28:50 +0000 (12:28 +0000)]
tests: Fix indentation in userspace packet type aware test.

CC: Ben Pfaff <blp@ovn.org>
Fixes: 7be29a47576d ("ofproto-dpif: Remove tabs from output.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
4 years agorhel: Support RHEL7.7 build and packaging
Yifeng Sun [Fri, 11 Oct 2019 21:49:14 +0000 (14:49 -0700)]
rhel: Support RHEL7.7 build and packaging

This patch provides essential fixes for OVS to support
RHEL7.7's new kernel.

make rpm-fedora-kmod \
RPMBUILD_OPT='-D "kversion 3.10.0-1062.1.2.el7.x86_64"'

Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agoovsdb-server: Allow replication from older schema version servers.
Numan Siddique [Mon, 21 Oct 2019 16:56:51 +0000 (22:26 +0530)]
ovsdb-server: Allow replication from older schema version servers.

Presently, replication is not allowed if there is a schema version mismatch between
the schema returned by the active ovsdb-server and the local db schema. This is
causing failures in OVN DB HA deployments during upgrades.

In the case of OpenStack tripleo deployment with OVN, OVN DB ovsdb-servers are
deployed on a multi node controller cluster in active/standby mode. During
minor updates or major upgrades, the cluster is updated one at a time. If
a node A is running active OVN DB ovsdb-servers and when it is updated, another
node B becomes active. After the update when OVN DB ovsdb-servers in A are started,
these ovsdb-servers fail to replicate from the active if there is a schema
version mismatch.

This patch addresses this issue by allowing replication even if there is a
schema version mismatch, but only if all the active db schema tables and their
columns are present in the local db schema.

This should not result in any data loss.
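
The compatibility rule above amounts to a subset check.  The following
is a rough sketch only: the structures and function names are
hypothetical simplifications and do not match the ovsdb schema types.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified schema description. */
struct table_schema {
    const char *name;
    const char *const *columns;
    size_t n_columns;
};

struct db_schema {
    const struct table_schema *tables;
    size_t n_tables;
};

static const struct table_schema *
find_table(const struct db_schema *db, const char *name)
{
    for (size_t i = 0; i < db->n_tables; i++) {
        if (!strcmp(db->tables[i].name, name)) {
            return &db->tables[i];
        }
    }
    return NULL;
}

static bool
has_column(const struct table_schema *table, const char *column)
{
    for (size_t i = 0; i < table->n_columns; i++) {
        if (!strcmp(table->columns[i], column)) {
            return true;
        }
    }
    return false;
}

/* Returns true if every table and column of 'active' also exists in
 * 'local', i.e. the local database can store everything the active
 * server may send. */
static bool
can_replicate(const struct db_schema *active, const struct db_schema *local)
{
    for (size_t i = 0; i < active->n_tables; i++) {
        const struct table_schema *at = &active->tables[i];
        const struct table_schema *lt = find_table(local, at->name);

        if (!lt) {
            return false;
        }
        for (size_t j = 0; j < at->n_columns; j++) {
            if (!has_column(lt, at->columns[j])) {
                return false;
            }
        }
    }
    return true;
}
```

In this sketch an upgraded standby whose schema has extra columns can
replicate from an older active server, but not the other way around.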

Signed-off-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agolldp: Fix for OVS crashes when a LLDP-enabled port is deleted
Surya Rudra [Mon, 21 Oct 2019 07:12:02 +0000 (12:42 +0530)]
lldp: Fix for OVS crashes when a LLDP-enabled port is deleted

Issue:
When LLDP is enabled on a port, a structure to hold LLDP related state
is created and that structure has a reference to the port. The ofproto
monitor thread accesses the LLDP structure to periodically send packets
over the associated port. When the port is deleted, the LLDP structure
is not cleaned up and it continues to refer to the deleted port.

When the monitor thread attempts to access the deleted port, OVS crashes.
The crash can also happen on bridge deletion and bond deletion.

Fix:
Remove all references to the LLDP structure and free it when
the port is deleted.

Signed-off-by: Surya Rudra <rudrasurya.r@altencalsoftlabs.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agodocs: DPDK isn't a datapath, so don't use the term.
Ben Pfaff [Wed, 23 Oct 2019 17:19:39 +0000 (10:19 -0700)]
docs: DPDK isn't a datapath, so don't use the term.

The DPDK library allows OVS fast access to packet I/O in userspace.  It
is not a datapath.  This commit avoids using that term.

Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agofaq: Give specific versions that introduced various features.
Ben Pfaff [Wed, 23 Oct 2019 17:19:38 +0000 (10:19 -0700)]
faq: Give specific versions that introduced various features.

Some users would find it useful to know the particular OVS version that
introduced a feature to the OVS tree kernel module or to the OVS
userspace (DPDK) datapath implementation.  This patch updates the FAQ
to include that information.

This information is primarily gleaned from the top-level NEWS file.
For most of these, I did not verify them by looking carefully through
the history, so some of them may be inaccurate, although a few people
made corrections in review.

Requested-by: Jianjun Shen <shenj@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agolacp: report desync in ovs threads enabling slave
Gowrishankar Muthukrishnan [Tue, 22 Oct 2019 05:29:14 +0000 (10:59 +0530)]
lacp: report desync in ovs threads enabling slave

It is helpful to report the case where the main thread has not yet
enabled a bond slave although the link state was already brought up by
the lacp thread, and to capture this desync between ovs threads for
debugging.

Fixes: a8448cb170 ("lacp: Avoid packet drop on LACP bond after link up")
Signed-off-by: Gowrishankar Muthukrishnan <gmuthukr@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agoovs-tcpundump: allow multiple packet lengths
Aaron Conole [Tue, 22 Oct 2019 14:55:59 +0000 (10:55 -0400)]
ovs-tcpundump: allow multiple packet lengths

The tcpundump tool expects all packets to be a length which aligns to
exactly a 4-nibble boundary.  This means packets like DNS requests will be
stripped before being correctly processed.  Fix this by allowing at least
two nibbles (or one byte) alignment.
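
The effect of the alignment unit can be illustrated with a small
calculation.  The helper below is hypothetical and is not taken from
ovs-tcpundump; it only shows how many hex digits survive when a dump
must consist of whole groups of 'unit' nibbles.

```c
#include <stddef.h>

/* Hypothetical helper: number of leading hex digits (nibbles) kept when
 * the input must be a whole number of 'unit'-nibble groups. */
static size_t
usable_nibbles(size_t n_nibbles, size_t unit)
{
    return n_nibbles - n_nibbles % unit;
}
```

For an assumed 75-byte packet (150 hex digits), a 4-nibble unit keeps
only 148 digits and loses the last byte, while a 2-nibble unit keeps
all 150.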

Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agoovs-tcpundump: exit when getting version
Aaron Conole [Tue, 22 Oct 2019 14:55:58 +0000 (10:55 -0400)]
ovs-tcpundump: exit when getting version

Running 'ovs-tcpundump -V' will cause ovs-tcpundump to start processing on
stdin.  Instead, print the version and exit.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agonetdev-dpdk: Fix Tx queue false sharing.
Ilya Maximets [Mon, 26 Aug 2019 14:54:04 +0000 (17:54 +0300)]
netdev-dpdk: Fix Tx queue false sharing.

'tx_q' array is allocated for each DPDK netdev.  'struct dpdk_tx_queue'
is 8 bytes long, so 8 tx queues are sharing the same cache line in
case of 64B cacheline size.  This causes a 'false sharing' issue in
the multiqueue case because taking the spinlock implies a write to
memory, i.e. cache invalidation.
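
One common remedy for this kind of false sharing is to align each queue
structure to a cache line so that no two queues share one.  The sketch
below assumes a 64-byte cache line and uses hypothetical names; the
message above does not say which mechanism the patch itself uses.

```c
#include <stdalign.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64  /* Assumed cache line size. */

/* Hypothetical 8-byte queue state, aligned so that each array element
 * starts on its own cache line.  The alignment also rounds the size of
 * the structure up to CACHE_LINE_SIZE. */
struct tx_queue_padded {
    alignas(CACHE_LINE_SIZE) int32_t lock;  /* Stand-in for a spinlock. */
    int32_t map;
};

_Static_assert(sizeof(struct tx_queue_padded) == CACHE_LINE_SIZE,
               "each queue must occupy exactly one cache line");
```

With this layout, a write to the lock of one queue invalidates only the
cache line that holds that queue.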

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
4 years agotravis: Test build with afxdp.
Ilya Maximets [Mon, 21 Oct 2019 13:33:04 +0000 (15:33 +0200)]
travis: Test build with afxdp.

We can't easily update the kernel on TravisCI to run system tests
with AF_XDP, but we could run build tests with libbpf and headers
from newer kernels.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
4 years agorhel: Remove the cond 'build_python3'
Numan Siddique [Mon, 21 Oct 2019 09:42:42 +0000 (15:12 +0530)]
rhel: Remove the cond 'build_python3'

A previous patch removed python2 support from ovs. So we can remove
this condition and make python3 mandatory for builds. Without this
patch, make rpm-fedora on centos 7 fails unless we pass
RPMBUILD_OPT="--with build_python3".

Signed-off-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agodebian and rhel: Add libunwind dev package.
William Tu [Thu, 17 Oct 2019 19:55:36 +0000 (12:55 -0700)]
debian and rhel: Add libunwind dev package.

This patch adds the libunwind dev package to debian and rhel.

Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
4 years agoossfuzz: Simplify miniflow fuzzer harness.
Bhargava Shastry [Fri, 18 Oct 2019 15:17:34 +0000 (17:17 +0200)]
ossfuzz: Simplify miniflow fuzzer harness.

Google's oss-fuzz builder bots were complaining that miniflow_target is
too slow to fuzz in that some tests take longer than a second to
complete. This patch fixes this by replacing the random flow generation
within the harness with a simpler scenario.

Signed-off-by: Bhargava Shastry <bshas3@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agodatapath: Allow attaching helper in later commit
Yi-Hung Wei [Tue, 15 Oct 2019 17:27:53 +0000 (10:27 -0700)]
datapath: Allow attaching helper in later commit

Upstream commit:
commit 248d45f1e1934f7849fbdc35ef1e57151cf063eb
Author: Yi-Hung Wei <yihung.wei@gmail.com>
Date:   Fri Oct 4 09:26:44 2019 -0700

    openvswitch: Allow attaching helper in later commit

    This patch allows to attach conntrack helper to a confirmed conntrack
    entry.  Currently, we can only attach alg helper to a conntrack entry
    when it is in the unconfirmed state.  This patch enables an use case
    that we can firstly commit a conntrack entry after it passed some
    initial conditions.  After that the processing pipeline will further
    check a couple of packets to determine if the connection belongs to
    a particular application, and attach alg helper to the connection
    in a later stage.

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agodatapath: Fix log message in ovs conntrack
Yi-Hung Wei [Tue, 15 Oct 2019 17:27:52 +0000 (10:27 -0700)]
datapath: Fix log message in ovs conntrack

Upstream commit:
commit 12c6bc38f99bb168b7f16bdb5e855a51a23ee9ec
Author: Yi-Hung Wei <yihung.wei@gmail.com>
Date:   Wed Aug 21 17:16:10 2019 -0700

    openvswitch: Fix log message in ovs conntrack

Fixes: 06bd2bdf19d2 ("openvswitch: Add timeout support to ct action")
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
4 years agodatapath: Replace removed NF_NAT_NEEDED with IS_ENABLED(CONFIG_NF_NAT)
Yi-Hung Wei [Tue, 15 Oct 2019 17:27:51 +0000 (10:27 -0700)]
datapath: Replace removed NF_NAT_NEEDED with IS_ENABLED(CONFIG_NF_NAT)

Backports the following upstream commit with some backward compatibility
changes.

commit f319ca6557c10a711facc4dd60197470796d3ec1
Author: Geert Uytterhoeven <geert@linux-m68k.org>
Date:   Wed May 8 08:52:32 2019 +0200

    openvswitch: Replace removed NF_NAT_NEEDED with IS_ENABLED(CONFIG_NF_NAT)

    Commit 4806e975729f99c7 ("netfilter: replace NF_NAT_NEEDED with
    IS_ENABLED(CONFIG_NF_NAT)") removed CONFIG_NF_NAT_NEEDED, but a new user
    popped up afterwards.

Fixes: fec9c271b8f1bde1 ("openvswitch: load and reference the NAT helper.")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Florian Westphal <fw@strlen.de>
Acked-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>