]> git.proxmox.com Git - ovs.git/log
ovs.git
6 years agoRevert "dpif: Ensure ERSPAN GRE support"
Greg Rose [Mon, 4 Jun 2018 20:14:36 +0000 (13:14 -0700)]
Revert "dpif: Ensure ERSPAN GRE support"

This reverts commit 8929c55287abae37efeac1e8876e6b3c2ccad0b9.

This is the wrong direction for the solution to the ip_gre/gre kernel
module conflicts, as reported by Jiri Benc <jbenc@redhat.com> and others in
https://mail.openvswitch.org/pipermail/ovs-dev/2018-June/347803.html and
elsewhere in the same thread

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agocompat: Fix compile warning
Greg Rose [Mon, 4 Jun 2018 20:33:30 +0000 (13:33 -0700)]
compat: Fix compile warning

Fix compile warning about redefined symbol

Fixes: 10f242363d ("compat: Add skb_checksum_simple_complete()")
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoFix typo in database commands documentation.
Mark Michelson [Mon, 4 Jun 2018 14:36:31 +0000 (10:36 -0400)]
Fix typo in database commands documentation.

s/remov/remove/

Signed-off-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agocompat: Add skb_checksum_simple_complete()
Greg Rose [Fri, 1 Jun 2018 20:07:43 +0000 (13:07 -0700)]
compat: Add skb_checksum_simple_complete()

A recent patch to gre.c added a call to skb_checksum_simple_complete()
which is not present in kernels before 3.16.  Fix up the compatability
layer to allow compile on older kernels that do not have it.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoutilities/ovs-ctl: Force removal of ip_gre/gre
Greg Rose [Thu, 31 May 2018 21:20:45 +0000 (14:20 -0700)]
utilities/ovs-ctl: Force removal of ip_gre/gre

On Linux kernels older than 4.16 the user cannot take advantage of
OVS ERSPAN features if the older ip_gre and gre kernel modules are
loaded.  In addition, the openvswitch kernel module will fail to
load because it cannot grab the IPPROTO_GRE inet protocol handler
since the gre kernel module has already taken it.

Update the force_reload_kmod() script function to force removal
of the ip_gre and gre built-in kernel modules so that the openvswitch
kernel module can load and provide support for ERSPAN.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agodpif: Ensure ERSPAN GRE support
Greg Rose [Thu, 31 May 2018 22:50:31 +0000 (15:50 -0700)]
dpif: Ensure ERSPAN GRE support

When verifying the built-in gre kernel module check for ERSPAN support.

Reported-by: Guru Shetty <guru@ovn.org>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agocompat: Fixups for newer kernels
Greg Rose [Thu, 31 May 2018 21:10:10 +0000 (14:10 -0700)]
compat: Fixups for newer kernels

A recent patch series added support for ERSPAN but left some problems
remaining for kernel releases from 4.10 to 4.14.  This patch
addresses those problems.

Of note is that the old cisco gre compat layer code is gone for good.

Also, several compat defines in acinclude.m4 were looking for keys
in .c source files - this does not work on distros without source
code.  A more reliable key was already defined so we use that instead.

We have pared support for the Linux kernel releases in .travis.yml
to reflect that 4.15 is no longer in the LTS list.  With this patch
the Out of Tree OVS datapath kernel modules can build on kernels
up to 4.14.47.  Support for kernels up to 4.16.x will be added
later.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agodatapath: ip6_gre: fix tunnel metadata device sharing.
William Tu [Tue, 29 May 2018 12:55:05 +0000 (05:55 -0700)]
datapath: ip6_gre: fix tunnel metadata device sharing.

commit b80d0b93b991e551a32157e0d9d38fc5bc9348a7
Author: William Tu <u9012063@gmail.com>
Date:   Fri May 18 19:22:28 2018 -0700

    net: ip6_gre: fix tunnel metadata device sharing.

    Currently ip6gre and ip6erspan share single metadata mode device,
    using 'collect_md_tun'.  Thus, when doing:
      ip link add dev ip6gre11 type ip6gretap external
      ip link add dev ip6erspan12 type ip6erspan external
      RTNETLINK answers: File exists
    simply fails due to the 2nd tries to create the same collect_md_tun.

    The patch fixes it by adding a separate collect md tunnel device
    for the ip6erspan, 'collect_md_tun_erspan'.  As a result, a couple
    of places need to refactor/split up in order to distinguish ip6gre
    and ip6erspan.

    First, move the collect_md check at ip6gre_tunnel_{unlink,link} and
    create separate function {ip6gre,ip6ersapn}_tunnel_{link_md,unlink_md}.
    Then before link/unlink, make sure the link_md/unlink_md is called.
    Finally, a separate ndo_uninit is created for ip6erspan.  Tested it
    using the samples/bpf/test_tunnel_bpf.sh.

Fixes: ef7baf5e083c ("ip6_gre: add ip6 erspan collect_md mode")
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
6 years agodatapath: ip6_gre: Fix ip6erspan hlen calculation
William Tu [Tue, 29 May 2018 12:55:04 +0000 (05:55 -0700)]
datapath: ip6_gre: Fix ip6erspan hlen calculation

commit 2d665034f239412927b1e71329f20f001c92da09
Author: Petr Machata <petrm@mellanox.com>
Date:   Thu May 17 16:36:51 2018 +0200

    net: ip6_gre: Fix ip6erspan hlen calculation

    Even though ip6erspan_tap_init() sets up hlen and tun_hlen according to
    what ERSPAN needs, it goes ahead to call ip6gre_tnl_link_config() which
    overwrites these settings with GRE-specific ones.

    Similarly for changelink callbacks, which are handled by
    ip6gre_changelink() calls ip6gre_tnl_change() calls
    ip6gre_tnl_link_config() as well.

    The difference ends up being 12 vs. 20 bytes, and this is generally not
    a problem, because a 12-byte request likely ends up allocating more and
    the extra 8 bytes are thus available. However correct it is not.

    So replace the newlink and changelink callbacks with an ERSPAN-specific
    ones, reusing the newly-introduced _common() functions.

Fixes: 5a963eb61b7c ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
6 years agodatapath: ip6_gre: Split up ip6gre_changelink()
William Tu [Tue, 29 May 2018 12:55:03 +0000 (05:55 -0700)]
datapath: ip6_gre: Split up ip6gre_changelink()

commit c8632fc30bb03aa0c3bd7bcce85355a10feb8149
Author: Petr Machata <petrm@mellanox.com>
Date:   Thu May 17 16:36:45 2018 +0200

    net: ip6_gre: Split up ip6gre_changelink()

    Extract from ip6gre_changelink() a reusable function
    ip6gre_changelink_common(). This will allow introduction of
    ERSPAN-specific _changelink() function with not a lot of code
    duplication.

Fixes: 5a963eb61b7c ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
6 years agodatapath: ip6_gre: Split up ip6gre_newlink()
William Tu [Tue, 29 May 2018 12:55:02 +0000 (05:55 -0700)]
datapath: ip6_gre: Split up ip6gre_newlink()

commit 7fa38a7c852ec99e3a7fc375eb2c21c50c2e46b8
Author: Petr Machata <petrm@mellanox.com>
Date:   Thu May 17 16:36:39 2018 +0200

    net: ip6_gre: Split up ip6gre_newlink()

    Extract from ip6gre_newlink() a reusable function
    ip6gre_newlink_common(). The ip6gre_tnl_link_config() call needs to be
    made customizable for ERSPAN, thus reorder it with calls to
    ip6_tnl_change_mtu() and dev_hold(), and extract the whole tail to the
    caller, ip6gre_newlink(). Thus enable an ERSPAN-specific _newlink()
    function without a lot of duplicity.

Fixes: 5a963eb61b7c ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
6 years agodatapath: ip6_gre: Split up ip6gre_tnl_change()
William Tu [Tue, 29 May 2018 12:55:01 +0000 (05:55 -0700)]
datapath: ip6_gre: Split up ip6gre_tnl_change()

commit a6465350ef495f5cbd76a3e505d25a01d648477e
Author: Petr Machata <petrm@mellanox.com>
Date:   Thu May 17 16:36:33 2018 +0200

    net: ip6_gre: Split up ip6gre_tnl_change()

    Split a reusable function ip6gre_tnl_copy_tnl_parm() from
    ip6gre_tnl_change(). This will allow ERSPAN-specific code to
    reuse the common parts while customizing the behavior for ERSPAN.

Fixes: 5a963eb61b7c ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
6 years agodatapath: ip6_gre: Split up ip6gre_tnl_link_config()
William Tu [Tue, 29 May 2018 12:55:00 +0000 (05:55 -0700)]
datapath: ip6_gre: Split up ip6gre_tnl_link_config()

commit a483373ead61e6079bc8ebe27e2dfdb2e3c1559f
Author: Petr Machata <petrm@mellanox.com>
Date:   Thu May 17 16:36:27 2018 +0200

    net: ip6_gre: Split up ip6gre_tnl_link_config()

    The function ip6gre_tnl_link_config() is used for setting up
    configuration of both ip6gretap and ip6erspan tunnels. Split the
    function into the common part and the route-lookup part. The latter then
    takes the calculated header length as an argument. This split will allow
    the patches down the line to sneak in a custom header length computation
    for the ERSPAN tunnel.

Fixes: 5a963eb61b7c ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
6 years agodatapath: ip6_gre: Fix headroom request in ip6erspan_tunnel_xmit()
William Tu [Tue, 29 May 2018 12:54:59 +0000 (05:54 -0700)]
datapath: ip6_gre: Fix headroom request in ip6erspan_tunnel_xmit()

commit 5691484df961aff897d824bcc26cd1a2aa036b5b
Author: Petr Machata <petrm@mellanox.com>
Date:   Thu May 17 16:36:15 2018 +0200

    net: ip6_gre: Fix headroom request in ip6erspan_tunnel_xmit()

    dev->needed_headroom is not primed until ip6_tnl_xmit(), so it starts
    out zero. Thus the call to skb_cow_head() fails to actually make sure
    there's enough headroom to push the ERSPAN headers to. That can lead to
    the panic cited below. (Reproducer below that).

    Fix by requesting either needed_headroom if already primed, or just the
    bare minimum needed for the header otherwise.

    [  190.703567] kernel BUG at net/core/skbuff.c:104!
    [  190.708384] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
    [  190.714007] Modules linked in: act_mirred cls_matchall ip6_gre ip6_tunnel tunnel6 gre sch_ingress vrf veth x86_pkg_t
emp_thermal mlx_platform nfsd e1000e leds_mlxcpld
    [  190.728975] CPU: 1 PID: 959 Comm: kworker/1:2 Not tainted 4.17.0-rc4-net_master-custom-139 #10
    [  190.737647] Hardware name: Mellanox Technologies Ltd. "MSN2410-CB2F"/"SA000874", BIOS 4.6.5 03/08/2016
    [  190.747006] Workqueue: ipv6_addrconf addrconf_dad_work
    [  190.752222] RIP: 0010:skb_panic+0xc3/0x100
    [  190.756358] RSP: 0018:ffff8801d54072f0 EFLAGS: 00010282
    [  190.761629] RAX: 0000000000000085 RBX: ffff8801c1a8ecc0 RCX: 0000000000000000
    [  190.768830] RDX: 0000000000000085 RSI: dffffc0000000000 RDI: ffffed003aa80e54
    [  190.776025] RBP: ffff8801bd1ec5a0 R08: ffffed003aabce19 R09: ffffed003aabce19
    [  190.783226] R10: 0000000000000001 R11: ffffed003aabce18 R12: ffff8801bf695dbe
    [  190.790418] R13: 0000000000000084 R14: 00000000000006c0 R15: ffff8801bf695dc8
    [  190.797621] FS:  0000000000000000(0000) GS:ffff8801d5400000(0000) knlGS:0000000000000000
    [  190.805786] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  190.811582] CR2: 000055fa929aced0 CR3: 0000000003228004 CR4: 00000000001606e0
    [  190.818790] Call Trace:
    [  190.821264]  <IRQ>
    [  190.823314]  ? ip6erspan_tunnel_xmit+0x5e4/0x1982 [ip6_gre]
    [  190.828940]  ? ip6erspan_tunnel_xmit+0x5e4/0x1982 [ip6_gre]
    [  190.834562]  skb_push+0x78/0x90
    [  190.837749]  ip6erspan_tunnel_xmit+0x5e4/0x1982 [ip6_gre]
    [  190.843219]  ? ip6gre_tunnel_ioctl+0xd90/0xd90 [ip6_gre]
    [  190.848577]  ? debug_check_no_locks_freed+0x210/0x210
    [  190.853679]  ? debug_check_no_locks_freed+0x210/0x210
    [  190.858783]  ? print_irqtrace_events+0x120/0x120
    [  190.863451]  ? sched_clock_cpu+0x18/0x210
    [  190.867496]  ? cyc2ns_read_end+0x10/0x10
    [  190.871474]  ? skb_network_protocol+0x76/0x200
    [  190.875977]  dev_hard_start_xmit+0x137/0x770
    [  190.880317]  ? do_raw_spin_trylock+0x6d/0xa0
    [  190.884624]  sch_direct_xmit+0x2ef/0x5d0
    [  190.888589]  ? pfifo_fast_dequeue+0x3fa/0x670
    [  190.892994]  ? pfifo_fast_change_tx_queue_len+0x810/0x810
    [  190.898455]  ? __lock_is_held+0xa0/0x160
    [  190.902422]  __qdisc_run+0x39e/0xfc0
    [  190.906041]  ? _raw_spin_unlock+0x29/0x40
    [  190.910090]  ? pfifo_fast_enqueue+0x24b/0x3e0
    [  190.914501]  ? sch_direct_xmit+0x5d0/0x5d0
    [  190.918658]  ? pfifo_fast_dequeue+0x670/0x670
    [  190.923047]  ? __dev_queue_xmit+0x172/0x1770
    [  190.927365]  ? preempt_count_sub+0xf/0xd0
    [  190.931421]  __dev_queue_xmit+0x410/0x1770
    [  190.935553]  ? ___slab_alloc+0x605/0x930
    [  190.939524]  ? print_irqtrace_events+0x120/0x120
    [  190.944186]  ? memcpy+0x34/0x50
    [  190.947364]  ? netdev_pick_tx+0x1c0/0x1c0
    [  190.951428]  ? __skb_clone+0x2fd/0x3d0
    [  190.955218]  ? __copy_skb_header+0x270/0x270
    [  190.959537]  ? rcu_read_lock_sched_held+0x93/0xa0
    [  190.964282]  ? kmem_cache_alloc+0x344/0x4d0
    [  190.968520]  ? cyc2ns_read_end+0x10/0x10
    [  190.972495]  ? skb_clone+0x123/0x230
    [  190.976112]  ? skb_split+0x820/0x820
    [  190.979747]  ? tcf_mirred+0x554/0x930 [act_mirred]
    [  190.984582]  tcf_mirred+0x554/0x930 [act_mirred]
    [  190.989252]  ? tcf_mirred_act_wants_ingress.part.2+0x10/0x10 [act_mirred]
    [  190.996109]  ? __lock_acquire+0x706/0x26e0
    [  191.000239]  ? sched_clock_cpu+0x18/0x210
    [  191.004294]  tcf_action_exec+0xcf/0x2a0
    [  191.008179]  tcf_classify+0xfa/0x340
    [  191.011794]  __netif_receive_skb_core+0x8e1/0x1c60
    [  191.016630]  ? debug_check_no_locks_freed+0x210/0x210
    [  191.021732]  ? nf_ingress+0x500/0x500
    [  191.025458]  ? process_backlog+0x347/0x4b0
    [  191.029619]  ? print_irqtrace_events+0x120/0x120
    [  191.034302]  ? lock_acquire+0xd8/0x320
    [  191.038089]  ? process_backlog+0x1b6/0x4b0
    [  191.042246]  ? process_backlog+0xc2/0x4b0
    [  191.046303]  process_backlog+0xc2/0x4b0
    [  191.050189]  net_rx_action+0x5cc/0x980
    [  191.053991]  ? napi_complete_done+0x2c0/0x2c0
    [  191.058386]  ? mark_lock+0x13d/0xb40
    [  191.062001]  ? clockevents_program_event+0x6b/0x1d0
    [  191.066922]  ? print_irqtrace_events+0x120/0x120
    [  191.071593]  ? __lock_is_held+0xa0/0x160
    [  191.075566]  __do_softirq+0x1d4/0x9d2
    [  191.079282]  ? ip6_finish_output2+0x524/0x1460
    [  191.083771]  do_softirq_own_stack+0x2a/0x40
    [  191.087994]  </IRQ>
    [  191.090130]  do_softirq.part.13+0x38/0x40
    [  191.094178]  __local_bh_enable_ip+0x135/0x190
    [  191.098591]  ip6_finish_output2+0x54d/0x1460
    [  191.102916]  ? ip6_forward_finish+0x2f0/0x2f0
    [  191.107314]  ? ip6_mtu+0x3c/0x2c0
    [  191.110674]  ? ip6_finish_output+0x2f8/0x650
    [  191.114992]  ? ip6_output+0x12a/0x500
    [  191.118696]  ip6_output+0x12a/0x500
    [  191.122223]  ? ip6_route_dev_notify+0x5b0/0x5b0
    [  191.126807]  ? ip6_finish_output+0x650/0x650
    [  191.131120]  ? ip6_fragment+0x1a60/0x1a60
    [  191.135182]  ? icmp6_dst_alloc+0x26e/0x470
    [  191.139317]  mld_sendpack+0x672/0x830
    [  191.143021]  ? igmp6_mcf_seq_next+0x2f0/0x2f0
    [  191.147429]  ? __local_bh_enable_ip+0x77/0x190
    [  191.151913]  ipv6_mc_dad_complete+0x47/0x90
    [  191.156144]  addrconf_dad_completed+0x561/0x720
    [  191.160731]  ? addrconf_rs_timer+0x3a0/0x3a0
    [  191.165036]  ? mark_held_locks+0xc9/0x140
    [  191.169095]  ? __local_bh_enable_ip+0x77/0x190
    [  191.173570]  ? addrconf_dad_work+0x50d/0xa20
    [  191.177886]  ? addrconf_dad_work+0x529/0xa20
    [  191.182194]  addrconf_dad_work+0x529/0xa20
    [  191.186342]  ? addrconf_dad_completed+0x720/0x720
    [  191.191088]  ? __lock_is_held+0xa0/0x160
    [  191.195059]  ? process_one_work+0x45d/0xe20
    [  191.199302]  ? process_one_work+0x51e/0xe20
    [  191.203531]  ? rcu_read_lock_sched_held+0x93/0xa0
    [  191.208279]  process_one_work+0x51e/0xe20
    [  191.212340]  ? pwq_dec_nr_in_flight+0x200/0x200
    [  191.216912]  ? get_lock_stats+0x4b/0xf0
    [  191.220788]  ? preempt_count_sub+0xf/0xd0
    [  191.224844]  ? worker_thread+0x219/0x860
    [  191.228823]  ? do_raw_spin_trylock+0x6d/0xa0
    [  191.233142]  worker_thread+0xeb/0x860
    [  191.236848]  ? process_one_work+0xe20/0xe20
    [  191.241095]  kthread+0x206/0x300
    [  191.244352]  ? process_one_work+0xe20/0xe20
    [  191.248587]  ? kthread_stop+0x570/0x570
    [  191.252459]  ret_from_fork+0x3a/0x50
    [  191.256082] Code: 14 3e ff 8b 4b 78 55 4d 89 f9 41 56 41 55 48 c7 c7 a0 cf db 82 41 54 44 8b 44 24 2c 48 8b 54 24 30 48 8b 74 24 20 e8 16 94 13 ff <0f> 0b 48 c7 c7 60 8e 1f 85 48 83 c4 20 e8 55 ef a6 ff 89 74 24
    [  191.275327] RIP: skb_panic+0xc3/0x100 RSP: ffff8801d54072f0
    [  191.281024] ---[ end trace 7ea51094e099e006 ]---
    [  191.285724] Kernel panic - not syncing: Fatal exception in interrupt
    [  191.292168] Kernel Offset: disabled
    [  191.295697] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

    Reproducer:

        ip link add h1 type veth peer name swp1
        ip link add h3 type veth peer name swp3

        ip link set dev h1 up
        ip address add 192.0.2.1/28 dev h1

        ip link add dev vh3 type vrf table 20
        ip link set dev h3 master vh3
        ip link set dev vh3 up
        ip link set dev h3 up

        ip link set dev swp3 up
        ip address add dev swp3 2001:db8:2::1/64

        ip link set dev swp1 up
        tc qdisc add dev swp1 clsact

        ip link add name gt6 type ip6erspan \
                local 2001:db8:2::1 remote 2001:db8:2::2 oseq okey 123
        ip link set dev gt6 up

        sleep 1

        tc filter add dev swp1 ingress pref 1000 matchall skip_hw \
                action mirred egress mirror dev gt6
        ping -I h1 192.0.2.2

Fixes: e41c7c68ea77 ("ip6erspan: make sure enough headroom at xmit.")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
6 years agodatapath: ip6_gre: Request headroom in __gre6_xmit()
William Tu [Tue, 29 May 2018 12:54:58 +0000 (05:54 -0700)]
datapath: ip6_gre: Request headroom in __gre6_xmit()

Upstream commit:
commit 01b8d064d58b4c1f0eff47f8fe8a8508cb3b3840
Author: Petr Machata <petrm@mellanox.com>
Date:   Thu May 17 16:36:10 2018 +0200

net: ip6_gre: Request headroom in __gre6_xmit()

__gre6_xmit() pushes GRE headers before handing over to ip6_tnl_xmit()
for generic IP-in-IP processing. However it doesn't make sure that there
is enough headroom to push the header to. That can lead to the panic
cited below. (Reproducer below that).

Fix by requesting either needed_headroom if already primed, or just the
bare minimum needed for the header otherwise.

[  158.576725] kernel BUG at net/core/skbuff.c:104!
[  158.581510] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
[  158.587174] Modules linked in: act_mirred cls_matchall ip6_gre ip6_tunnel tunnel6 gre sch_ingress vrf veth x86_pkg_t
emp_thermal mlx_platform nfsd e1000e leds_mlxcpld
[  158.602268] CPU: 1 PID: 16 Comm: ksoftirqd/1 Not tainted 4.17.0-rc4-net_master-custom-139 #10
[  158.610938] Hardware name: Mellanox Technologies Ltd. "MSN2410-CB2F"/"SA000874", BIOS 4.6.5 03/08/2016
[  158.620426] RIP: 0010:skb_panic+0xc3/0x100
[  158.624586] RSP: 0018:ffff8801d3f27110 EFLAGS: 00010286
[  158.629882] RAX: 0000000000000082 RBX: ffff8801c02cc040 RCX: 0000000000000000
[  158.637127] RDX: 0000000000000082 RSI: dffffc0000000000 RDI: ffffed003a7e4e18
[  158.644366] RBP: ffff8801bfec8020 R08: ffffed003aabce19 R09: ffffed003aabce19
[  158.651574] R10: 000000000000000b R11: ffffed003aabce18 R12: ffff8801c364de66
[  158.658786] R13: 000000000000002c R14: 00000000000000c0 R15: ffff8801c364de68
[  158.666007] FS:  0000000000000000(0000) GS:ffff8801d5400000(0000) knlGS:0000000000000000
[  158.674212] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  158.680036] CR2: 00007f4b3702dcd0 CR3: 0000000003228002 CR4: 00000000001606e0
[  158.687228] Call Trace:
[  158.689752]  ? __gre6_xmit+0x246/0xd80 [ip6_gre]
[  158.694475]  ? __gre6_xmit+0x246/0xd80 [ip6_gre]
[  158.699141]  skb_push+0x78/0x90
[  158.702344]  __gre6_xmit+0x246/0xd80 [ip6_gre]
[  158.706872]  ip6gre_tunnel_xmit+0x3bc/0x610 [ip6_gre]
[  158.711992]  ? __gre6_xmit+0xd80/0xd80 [ip6_gre]
[  158.716668]  ? debug_check_no_locks_freed+0x210/0x210
[  158.721761]  ? print_irqtrace_events+0x120/0x120
[  158.726461]  ? sched_clock_cpu+0x18/0x210
[  158.730572]  ? sched_clock_cpu+0x18/0x210
[  158.734692]  ? cyc2ns_read_end+0x10/0x10
[  158.738705]  ? skb_network_protocol+0x76/0x200
[  158.743216]  ? netif_skb_features+0x1b2/0x550
[  158.747648]  dev_hard_start_xmit+0x137/0x770
[  158.752010]  sch_direct_xmit+0x2ef/0x5d0
[  158.755992]  ? pfifo_fast_dequeue+0x3fa/0x670
[  158.760460]  ? pfifo_fast_change_tx_queue_len+0x810/0x810
[  158.765975]  ? __lock_is_held+0xa0/0x160
[  158.770002]  __qdisc_run+0x39e/0xfc0
[  158.773673]  ? _raw_spin_unlock+0x29/0x40
[  158.777781]  ? pfifo_fast_enqueue+0x24b/0x3e0
[  158.782191]  ? sch_direct_xmit+0x5d0/0x5d0
[  158.786372]  ? pfifo_fast_dequeue+0x670/0x670
[  158.790818]  ? __dev_queue_xmit+0x172/0x1770
[  158.795195]  ? preempt_count_sub+0xf/0xd0
[  158.799313]  __dev_queue_xmit+0x410/0x1770
[  158.803512]  ? ___slab_alloc+0x605/0x930
[  158.807525]  ? ___slab_alloc+0x605/0x930
[  158.811540]  ? memcpy+0x34/0x50
[  158.814768]  ? netdev_pick_tx+0x1c0/0x1c0
[  158.818895]  ? __skb_clone+0x2fd/0x3d0
[  158.822712]  ? __copy_skb_header+0x270/0x270
[  158.827079]  ? rcu_read_lock_sched_held+0x93/0xa0
[  158.831903]  ? kmem_cache_alloc+0x344/0x4d0
[  158.836199]  ? skb_clone+0x123/0x230
[  158.839869]  ? skb_split+0x820/0x820
[  158.843521]  ? tcf_mirred+0x554/0x930 [act_mirred]
[  158.848407]  tcf_mirred+0x554/0x930 [act_mirred]
[  158.853104]  ? tcf_mirred_act_wants_ingress.part.2+0x10/0x10 [act_mirred]
[  158.860005]  ? __lock_acquire+0x706/0x26e0
[  158.864162]  ? mark_lock+0x13d/0xb40
[  158.867832]  tcf_action_exec+0xcf/0x2a0
[  158.871736]  tcf_classify+0xfa/0x340
[  158.875402]  __netif_receive_skb_core+0x8e1/0x1c60
[  158.880334]  ? nf_ingress+0x500/0x500
[  158.884059]  ? process_backlog+0x347/0x4b0
[  158.888241]  ? lock_acquire+0xd8/0x320
[  158.892050]  ? process_backlog+0x1b6/0x4b0
[  158.896228]  ? process_backlog+0xc2/0x4b0
[  158.900291]  process_backlog+0xc2/0x4b0
[  158.904210]  net_rx_action+0x5cc/0x980
[  158.908047]  ? napi_complete_done+0x2c0/0x2c0
[  158.912525]  ? rcu_read_unlock+0x80/0x80
[  158.916534]  ? __lock_is_held+0x34/0x160
[  158.920541]  __do_softirq+0x1d4/0x9d2
[  158.924308]  ? trace_event_raw_event_irq_handler_exit+0x140/0x140
[  158.930515]  run_ksoftirqd+0x1d/0x40
[  158.934152]  smpboot_thread_fn+0x32b/0x690
[  158.938299]  ? sort_range+0x20/0x20
[  158.941842]  ? preempt_count_sub+0xf/0xd0
[  158.945940]  ? schedule+0x5b/0x140
[  158.949412]  kthread+0x206/0x300
[  158.952689]  ? sort_range+0x20/0x20
[  158.956249]  ? kthread_stop+0x570/0x570
[  158.960164]  ret_from_fork+0x3a/0x50
[  158.963823] Code: 14 3e ff 8b 4b 78 55 4d 89 f9 41 56 41 55 48 c7 c7 a0 cf db 82 41 54 44 8b 44 24 2c 48 8b 54 24 30 48 8b 74 24 20 e8 16 94 13 ff <0f> 0b 48 c7 c7 60 8e 1f 85 48 83 c4 20 e8 55 ef a6 ff 89 74 24
[  158.983235] RIP: skb_panic+0xc3/0x100 RSP: ffff8801d3f27110
[  158.988935] ---[ end trace 5af56ee845aa6cc8 ]---
[  158.993641] Kernel panic - not syncing: Fatal exception in interrupt
[  159.000176] Kernel Offset: disabled
[  159.003767] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

Reproducer:

ip link add h1 type veth peer name swp1
ip link add h3 type veth peer name swp3

ip link set dev h1 up
ip address add 192.0.2.1/28 dev h1

ip link add dev vh3 type vrf table 20
ip link set dev h3 master vh3
ip link set dev vh3 up
ip link set dev h3 up

ip link set dev swp3 up
ip address add dev swp3 2001:db8:2::1/64

ip link set dev swp1 up
tc qdisc add dev swp1 clsact

ip link add name gt6 type ip6gretap \
local 2001:db8:2::1 remote 2001:db8:2::2
ip link set dev gt6 up

sleep 1

tc filter add dev swp1 ingress pref 1000 matchall skip_hw \
action mirred egress mirror dev gt6
ping -I h1 192.0.2.2

Fixes: c12b395a4664 ("gre: Support GRE over IPv6")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
6 years agoofproto-dpif: Use dp_hash as default selection method
Jan Scheurich [Thu, 24 May 2018 15:28:01 +0000 (17:28 +0200)]
ofproto-dpif: Use dp_hash as default selection method

The dp_hash selection method for select groups overcomes the scalability
problems of the current default selection method which, due to L2-L4
hashing during xlation and un-wildcarding of the hashed fields,
basically requires an upcall to the slow path to load-balance every
L4 connection. The consequence are an explosion of datapath flows
(megaflows degenerate to miniflows) and a limitation of connection
setup rate OVS can handle.

This commit changes the default selection method to dp_hash, provided the
bucket configuration is such that the dp_hash method can accurately
represent the bucket weights with up to 64 hash values. Otherwise we
stick to original default hash method.

We use the new dp_hash algorithm OVS_HASH_L4_SYMMETRIC to maintain the
symmetry property of the old default hash method.

A controller can explicitly request the old default hash selection method
by specifying selection method "hash" with an empty list of fields in the
Group properties of the OpenFlow 1.5 Group Mod message.

Update the documentation about selection method in the ovs-ovctl man page.

Revise and complete the ofproto-dpif unit tests cases for select groups.

Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
Co-authored-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoofproto-dpif: Improve dp_hash selection method for select groups
Jan Scheurich [Thu, 24 May 2018 15:28:00 +0000 (17:28 +0200)]
ofproto-dpif: Improve dp_hash selection method for select groups

The current implementation of the "dp_hash" selection method suffers
from two deficiences: 1. The hash mask and hence the number of dp_hash
values is just large enough to cover the number of group buckets, but
does not consider the case that buckets have different weights. 2. The
xlate-time selection of best bucket from the masked dp_hash value often
results in bucket load distributions that are quite different from the
bucket weights because the number of available masked dp_hash values
is too small (2-6 bits compared to 32 bits of a full hash in the default
hash selection method).

This commit provides a more accurate implementation of the dp_hash
select group by applying the well known Webster method for distributing
a small number of "seats" fairly over the weighted "parties"
(see https://en.wikipedia.org/wiki/Webster/Sainte-Lagu%C3%AB_method).
The dp_hash mask is autmatically chosen large enough to provide good
enough accuracy even with widely differing weights.

This distribution happens at group modification time and the resulting
table is stored with the group-dpif struct. At xlation time, we use the
masked dp_hash values as index to look up the assigned bucket.

If the bucket should not be live, we do a circular search over the
mapping table until we find the first live bucket. As the buckets in
the table are by construction in pseudo-random order with a frequency
according to their weight, this method maintains correct distribution
even if one or more buckets are non-live.

Xlation is further simplified by storing some derived select group state
at group construction in struct group-dpif in a form better suited for
xlation purposes.

Adapted the unit test case for dp_hash select group accordingly.

Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
Co-authored-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agouserspace datapath: Add OVS_HASH_L4_SYMMETRIC dp_hash algorithm
Jan Scheurich [Thu, 24 May 2018 15:27:59 +0000 (17:27 +0200)]
userspace datapath: Add OVS_HASH_L4_SYMMETRIC dp_hash algorithm

This commit implements a new dp_hash algorithm OVS_HASH_L4_SYMMETRIC in
the netdev datapath. It will be used as default hash algorithm for the
dp_hash-based select groups in a subsequent commit to maintain
compatibility with the symmetry property of the current default hash
selection method.

A new dpif_backer_support field 'max_hash_alg' is introduced to reflect
the highest hash algorithm a datapath supports in the dp_hash action.

Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
Co-authored-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agovlog: exit with error if explicitly specified logfile cannot be opened
Dan Williams [Fri, 25 May 2018 17:49:59 +0000 (12:49 -0500)]
vlog: exit with error if explicitly specified logfile cannot be opened

It seems like if the user wanted a specific logfile but that request
cannot be fulfilled, OVS/OVN shouldn't just continue as if nothing
really happened (besides logging a warning).

Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agovconn: Remove obsolete comment.
Ben Pfaff [Wed, 23 May 2018 23:39:56 +0000 (16:39 -0700)]
vconn: Remove obsolete comment.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agotests: Avoid printing Python exception for hosts without IPv6 support.
Ben Pfaff [Wed, 23 May 2018 22:15:36 +0000 (15:15 -0700)]
tests: Avoid printing Python exception for hosts without IPv6 support.

The tests probe whether the host has IPv6 support and, if it doesn't, skip
the tests that require IPv6.  However, until now, when the host lacks
support, this caused a Python exception to be printed, like this:

Traceback (most recent call last):
  File "<string>", line 3, in <module>
  File "/usr/lib64/python2.7/socket.py", line 187, in __init__
    _sock = _realsocket(family, type, proto)
socket.error: [Errno 97] Address family not supported by protocol

This exception is expected and harmless, but it reasonably surprised some
users.  This commit fixes the problem.

Reported-by: Paul Greenberg
Reported-at: https://github.com/openvswitch/ovs-issues/issues/146#issuecomment-390081887
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovs-sim: Support backup and clustered databases for ovn.
Ben Pfaff [Thu, 17 May 2018 21:20:15 +0000 (14:20 -0700)]
ovs-sim: Support backup and clustered databases for ovn.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovs-vsctl, ovn-nbctl, ovn-sbctl, vtep-ctl: Parse options before logging.
Ben Pfaff [Thu, 17 May 2018 21:04:25 +0000 (14:04 -0700)]
ovs-vsctl, ovn-nbctl, ovn-sbctl, vtep-ctl: Parse options before logging.

These utilities logged the command very early, before parsing the options
or the command.  This meant that logging options (like --log-file or
-vsyslog:off) weren't considered for the purpose of logging the command.
This fixes the problem.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovs-sim, ovs-sandbox: Turn off logging to syslog.
Ben Pfaff [Thu, 17 May 2018 20:53:44 +0000 (13:53 -0700)]
ovs-sim, ovs-sandbox: Turn off logging to syslog.

There's no value in having these testing tools log to syslog.  It just
pollutes the system log.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovs-sim: Install RST manpages into simulation environment too.
Ben Pfaff [Thu, 17 May 2018 19:41:23 +0000 (12:41 -0700)]
ovs-sim: Install RST manpages into simulation environment too.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovs-sandbox: Add option to support multiple ovn-controllers.
Ben Pfaff [Fri, 25 May 2018 21:24:18 +0000 (14:24 -0700)]
ovs-sandbox: Add option to support multiple ovn-controllers.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovs-sim: Convert documentation to RST format.
Ben Pfaff [Thu, 17 May 2018 19:31:45 +0000 (12:31 -0700)]
ovs-sim: Convert documentation to RST format.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovsdb: Improve timing in cluster torture test.
Ben Pfaff [Thu, 17 May 2018 18:06:01 +0000 (11:06 -0700)]
ovsdb: Improve timing in cluster torture test.

Until now the timing in the cluster torture test has been pretty
inaccurate because it just worked by calling "sleep 1" in a loop that
did other things.  The longer those other things took, the more
inaccurate it got.

This commit changes to using a separate process for timing.  It still won't
be all that accurate but at least the timing loop doesn't try to do
anything else.

(I'm not sure how to actually get accurate timing in shell.)

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovsdb: Improve torture test for clusters.
Ben Pfaff [Thu, 17 May 2018 17:41:51 +0000 (10:41 -0700)]
ovsdb: Improve torture test for clusters.

This test is supposed to be parameterized, but one of the loops didn't
honor the parameterization and just had hardcoded values.  Also, the
output comparison didn't work properly for more than 100 client sets
(n1 > 100), so this adds some explicit sorting to the mix.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovn: Update TODO.
Ben Pfaff [Thu, 24 May 2018 18:00:35 +0000 (11:00 -0700)]
ovn: Update TODO.

We've actually made a lot of improvements.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agocompati/ip_gre: remove duplicate vport definition.
William Tu [Fri, 25 May 2018 13:28:49 +0000 (06:28 -0700)]
compati/ip_gre: remove duplicate vport definition.

Clean up the duplicate definition of OVS_VPORT_TYPE_ERSPAN
since it is defined in openvswitch.h.

Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoodp-util: refector erspan option parsing.
William Tu [Fri, 25 May 2018 13:28:48 +0000 (06:28 -0700)]
odp-util: refector erspan option parsing.

Instead of memcpy to a local stack, parse the erspan
metadata in memory.

Suggested-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoerspan: update NEWS and FAQ.
William Tu [Fri, 25 May 2018 16:47:20 +0000 (09:47 -0700)]
erspan: update NEWS and FAQ.

Update Documentation/faq/configuration.rst about ERSPAN
and Update NEWS.

Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoEmbrace anonymous unions.
Ben Pfaff [Thu, 24 May 2018 17:32:59 +0000 (10:32 -0700)]
Embrace anonymous unions.

Several OVS structs contain embedded named unions, like this:

struct {
    ...
    union {
        ...
    } u;
};

C11 standardized a feature that many compilers already implemented
anyway, where an embedded union may be unnamed, like this:

struct {
    ...
    union {
        ...
    };
};

This is more convenient because it allows the programmer to omit "u."
in many places.  OVS already used this feature in several places.  This
commit embraces it in several others.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
Tested-by: Alin Gabriel Serdean <aserdean@ovn.org>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
6 years agoovs-fields: Improve formatting of NSH section.
Ben Pfaff [Fri, 18 May 2018 17:16:41 +0000 (10:16 -0700)]
ovs-fields: Improve formatting of NSH section.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoovs-fields: Correct ideas about which OXM classes are official.
Ben Pfaff [Fri, 18 May 2018 17:16:40 +0000 (10:16 -0700)]
ovs-fields: Correct ideas about which OXM classes are official.

The purpose of including an OpenFlow version in the notes in meta-flow.h
and ovs-fields.7 is to explain what version of OpenFlow standardized a
given field.  NXOXM_* are not standardized so they should not have an
OpenFlow version.  This commit corrects it.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoerspan: add NXOXM_ET_ERSPAN_ field tests.
William Tu [Fri, 25 May 2018 15:03:18 +0000 (08:03 -0700)]
erspan: add NXOXM_ET_ERSPAN_ field tests.

ERSPAN is the first real-world use cases of Experimenter OXM,
which introduces 4 new NXOXM_ET_ fields (ver, idx, dir, hwid).
The patch adds test cases for these fields.

At the same time, delete the special case for NXOXM_ET_DP_HASH,
because it was only in there for testing anyway.

Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoovn pacemaker: Fix promotion issue when the master node is reset
Numan Siddique [Thu, 17 May 2018 10:04:09 +0000 (15:34 +0530)]
ovn pacemaker: Fix promotion issue when the master node is reset

When a node 'A' in the pacemaker cluster running OVN db servers in
master is brought down ungracefully ('echo b > /proc/sysrq_trigger'
for example), pacemaker is not able to promote any other node to
master in the cluster. When pacemaker selects a node B for instance to
promote, it moves the IPAddr2 resource (i.e the master ip) to node
'B'. As soon the node is configured with the IP address, when the
issue is seen, the OVN db servers which were running as standy
earlier, transitions to active. Ideally this should not have happened.
The ovsdb-servers are expected to remain in standby until there are
promoted. (This needs separate investigation). When the pacemaker
calls the OVN OCF script's promote action, the ovsdb_server_promot
function returns almost immediately without recording the present
master. And later in the notify action it demotes back the OVN db
servers since the last known master doesn't match with node 'B's
hostname. This results in pacemaker promoting/demoting in a loop.

This patch fixes the issue by not returning immediately when promote
action is called if the OVN db servers are running as active. Now it
would continue with the ovsdb_server_promot function and records the
new master by setting proper master score ($CRM_MASTER -N $host_name
-v ${master_score})

This issue is not seen when a node is brought down gracefully as
pacemaker before promoting a node, calls stop, start and then promote
actions. Not sure why pacemaker doesn't call stop, start and promote
actions when a node is reset ungracefully.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1579025
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Russell Bryant <russell@ovn.org>
6 years agodpdk: reflect status and version in the database
Aaron Conole [Thu, 3 May 2018 19:08:01 +0000 (15:08 -0400)]
dpdk: reflect status and version in the database

The normal way of retrieving the running DPDK status involves parsing
log files and issuing various incantations of ovs-vsctl and ovs-appctl
commands to determine whether the rte_eal_init successfully started.

This commit adds two new records to reflect the dpdk version, and
the dpdk initialization status.

To support this, the other_config:dpdk-init configuration block supports
the 'true' and 'try' keywords now, instead of just 'true'.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
6 years agodpdk: allow init to fail
Aaron Conole [Thu, 3 May 2018 19:08:00 +0000 (15:08 -0400)]
dpdk: allow init to fail

It's possible for dpdk initialization to fail either due to an internal
error or an invalid configuration.  When that happens, it's rather
impolite to immediately abort without any details.

With this change, a failed dpdk initialization attempt will continue to
trigger a SIGABRT.  However, the failure details will be logged, and a
user or administrator may have more information to correct the issue.
A restart of OvS would still be required to re-attempt initialization.

The refactor to propagate the init error will be used in an upcoming
commit.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
6 years agodpif-netdev: Free packets on TUNNEL_PUSH if should_steal.
Ilya Maximets [Thu, 24 May 2018 09:51:21 +0000 (12:51 +0300)]
dpif-netdev: Free packets on TUNNEL_PUSH if should_steal.

Unconditional return may cause packet leak in case of
'should_steal == true'.

Additionally, removed redundant checking for depth level.

CC: Sugesh Chandran <sugesh.chandran@intel.com>
Fixes: 7c12dfc527a5 ("tunneling: Avoid datapath-recirc by
                      combining recirc actions at xlate.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokess <ian.stokes@intel.com>
6 years agonetdev-dpdk: fix check for "net_nfp" driver
Timothy Redaelli [Thu, 17 May 2018 16:45:01 +0000 (18:45 +0200)]
netdev-dpdk: fix check for "net_nfp" driver

Currently the check of "net_nfp" driver while enabling scatter compares
only the first 6 bytes, but "net_nfp" is 7 bytes long.

This change fixes the check by comparing the first 7 bytes.

CC: Pablo Cascón <pablo.cascon@netronome.com>
CC: Simon Horman <simon.horman@netronome.com>
Fixes: 65a87968f4cf ("netdev-dpdk: don't enable scatter for jumbo RX support for nfp")
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Acked-by: Pablo Cascón <pablo.cascon@netronome.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
6 years agonetdev-dpdk: Don't use PMD driver if not configured successfully
Eelco Chaudron [Wed, 16 May 2018 14:15:34 +0000 (16:15 +0200)]
netdev-dpdk: Don't use PMD driver if not configured successfully

When initialization of the DPDK PMD driver fails
(dpdk_eth_dev_init()), the reconfigure_datapath() function will remove
the port from dp_netdev, and the port is not used.

Now when bridge_reconfigure() is called again, no changes to the
previous failing netdev configuration are detected and therefore the
ports gets added to dp_netdev and used uninitialized. This is causing
exceptions...

The fix has two parts to it. First in netdev-dpdk.c we remember if the
DPDK port was started or not, and when calling
netdev_dpdk_reconfigure() we also try re-initialization if the port
was not already active. The second part of the change is in
dpif-netdev.c where it makes sure netdev_reconfigure() is called if
the port needs reconfiguration, as netdev_is_reconf_required() is only
true until netdev_reconfigure() is called (even if it fails).

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Tested-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
6 years agonetdev-dpdk: Remove use of rte_mempool_ops_get_count.
Kevin Traynor [Wed, 23 May 2018 13:41:30 +0000 (14:41 +0100)]
netdev-dpdk: Remove use of rte_mempool_ops_get_count.

rte_mempool_ops_get_count is not exported by DPDK so it means it
cannot be used by OVS when using DPDK as a shared library.

Remove rte_mempool_ops_get_count but still use rte_mempool_full
and document it's behavior.

Fixes: 91fccdad72a2 ("netdev-dpdk: Free mempool only when no in-use mbufs.")
Reported-by: Timothy Redaelli <tredaelli@redhat.com>
Reported-by: Markos Chandras <mchandras@suse.de>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
6 years agoExtend tests for conjunctive match support in OVN
Numan Siddique [Thu, 24 May 2018 15:45:53 +0000 (17:45 +0200)]
Extend tests for conjunctive match support in OVN

Check the application of conjunctive matching to logical flow match
expressions. In particular cover the case where conjunctive matching is
applied to ACL match expressions that refer to Address Sets.

Mark Michelson who tested a similar patch [1] has found a significant
improvement in ACL processing and reduction of OF flows from an order of
1 million to few thousands. [2]

Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
[1] - https://mail.openvswitch.org/pipermail/ovs-dev/2018-February/344523.html
[2] - https://mail.openvswitch.org/pipermail/ovs-dev/2018-February/344311.html

Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoFactor prerequisites out of AND/OR trees with unique symbol
Jakub Sitnicki [Thu, 24 May 2018 15:45:52 +0000 (17:45 +0200)]
Factor prerequisites out of AND/OR trees with unique symbol

Appending prerequisites to sub-expressions of OR that are all over one
symbol prevents the expression-to-matches converter from applying
conjunctive matching. This happens during the annotation phase.

input:      s1 == { c1, c2 } && s2 == { c3, c4 }
expanded:   (s1 == c1 || s1 == c2) && (s2 == c3 || s2 == c4)
annotated:  ((p1 && s1 == c1) || (p1 && s1 == c2)) &&
            ((p2 && s2 == c3) || (p2 && s2 == c4))
normalized: (p1 && p2 && s1 == c1 && s2 == c3) ||
            (p1 && p2 && s1 == c1 && s2 == c4) ||
            (p1 && p2 && s1 == c2 && s2 == c3) ||
            (p1 && p2 && s1 == c2 && s2 == c4)

Where s1,s2 - symbols, c1..c4 - constants, p1,p2 - prerequisites.

Since sub-expressions of OR trees that are over one symbol all have the
same prerequisites, we can factor them out leaving the OR tree in tact,
and enabling the converter to apply conjunctive matching to
AND(OR(clause)) trees.

Going back to our example this change gives us:

input:      s1 == { c1, c2 } && s2 == { c3, c4 }
expanded:   (s1 == c1 || s1 == c2) && (s2 == c3 || s2 == c4)
annotated:  (s1 == c1 || s1 == c2) && p1 && (s2 == c3 || s2 == c4) && p2
normalized: p1 && p2 && (s1 == c1 || s1 == c2) && (s2 == c3 || s2 == c4)

We also factor out the prerequisites out of pure AND or mixed AND/OR
trees to keep the common code path, but in this case the only thing we
gain is a shorter expression as prerequisites for each symbol appear
only once.

Documentation comments have been contributed by Ben Pfaff.

Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agonetdev-native-tnl: Fix alignment for erspan index.
Darrell Ball [Thu, 24 May 2018 02:13:56 +0000 (19:13 -0700)]
netdev-native-tnl: Fix alignment for erspan index.

Flagged by clang.

CC: William Tu <u9012063@gmail.com>
Fixes: 068794b43f0e ("erspan: Add flow-based erspan options")
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agoofp-flow: Fix uninitialized data decoding OF1.5 flow stats.
Ben Pfaff [Wed, 23 May 2018 20:51:59 +0000 (13:51 -0700)]
ofp-flow: Fix uninitialized data decoding OF1.5 flow stats.

Reported-by: Paul Greenberg
Reported-at: https://github.com/openvswitch/ovs-issues/issues/149
Fixes: c7b02b800615 ("Add support for OpenFlow 1.5 statistics (OXS).")
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Darrell Ball <dlu998@gmail.com>
6 years agorconn: Introduce new invariant to fix assertion failure in corner case.
Ben Pfaff [Wed, 23 May 2018 23:58:31 +0000 (16:58 -0700)]
rconn: Introduce new invariant to fix assertion failure in corner case.

Until now, rconn_get_version() has only reported the OpenFlow version in
use when the rconn is actually connected.  This makes sense, but it has a
harsh consequence.  Consider code like this:

    if (rconn_is_connected(rconn) && rconn_get_version(rconn) >= 0) {
        for (int i = 0; i < 2; i++) {
            struct ofpbuf *b = ofputil_encode_echo_request(
                rconn_get_version(rconn));
            rconn_send(rconn, b, NULL);
        }
    }

Maybe not the smartest code in the world, and probably no one would write
this exact code in any case, but it doesn't look too risky or crazy.

But it is.  The second trip through the loop can assert-fail inside
ofputil_encode_echo_request() because rconn_get_version(rconn) returns -1
instead of a valid OpenFlow version.  That happens if the first call to
rconn_send() encounters an error while sending the message and therefore
destroys the underlying vconn and disconnects so that rconn_get_version()
doesn't have a vconn to query for its version.

In a case like this where all the code to send the messages is close by, we
could just check rconn_get_version() in each loop iteration.  We could even
go through the tree and convince ourselves that individual bits of code are
safe, or be conservative and check rconn_get_version() >= 0 in the iffy
cases.  But this seems to me like an ongoing source of risk and a way to
get things wrong in corner cases.

This commit takes a different approach.  It introduces a new invariant: if
an rconn has ever been connected, then it returns a valid OpenFlow version
from rconn_get_version().  In addition, if an rconn is currently connected,
then the OpenFlow version it returns is the correct one (that may be
obvious, but there were corner cases before where it returned -1 even
though rconn_is_connected() returned true).

With this commit, the code above would work OK.  If the first call to
rconn_send() encounters an error sending the message, then
rconn_get_version() in the second iteration will return the same value as
in the first iteration.  The message passed to rconn_send() will end up
being discarded, but that's much better than either an assertion failure or
having to carefully analyze a lot of our code to deal with one unusual
corner case.

Reported-by: Han Zhou <zhouhan@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Han Zhou <hzhou8@ebay.com>
6 years agodatapath: compat: Fix ndo_size in RHEL 7.5 backport
Yi-Hung Wei [Thu, 17 May 2018 19:39:51 +0000 (12:39 -0700)]
datapath: compat: Fix ndo_size in RHEL 7.5 backport

If 'ndo_size' is not set in 'struct net_device_ops', RHEL kernel will not
make use of functions in 'struct net_device_ops_extended'.

Fixes: 39ca338374ab ("datapath: compat: Fix build on RHEL 7.5")
Reported-by: Jiri Benc <jbenc@redhat.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2018-May/347070.html
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Jiri Benc <jbenc@redhat.com>
6 years agoovn-controller: Count calls to lflow_run()
Jakub Sitnicki [Fri, 18 May 2018 16:55:35 +0000 (18:55 +0200)]
ovn-controller: Count calls to lflow_run()

lflow_run() is the main logical flows processing routine that we spend
most of the CPU time in when testing at scale.

With the switch to incremental processing approach in the controller,
we will be trying to avoid calling to lflow_run() as much as possible.

A counter lets us confirm that we are doing logical flow processing
only when it's expected, without resorting to profiling under stress.

It can also serve as a hint as to why ovn-controller process is
consuming CPU time.

Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Han Zhou <hzhou8@ebay.com>
6 years agorhel: Use openvswitch user/group for the log directory
Timothy Redaelli [Wed, 23 May 2018 13:46:32 +0000 (15:46 +0200)]
rhel: Use openvswitch user/group for the log directory

Commit 94cd8383e297 ("rhel: fix log directory permissions") restored the
old 755 permission on /var/log/openvswitch and this can result in the
exposure of sensitive information.

Since commit f624bf23b62a ("rhel: user/group openvswitch does not exist")
moved the user/group creations in %pre phase it's now possible to change
/var/log/openvswitch user/group to openvswitch:openvswitch and remove
the r/x bits for other again without having the "permission denied"
error when the logs are rotated.

CC: Aaron Conole <aconole@redhat.com>
Fixes: 94cd8383e297 ("rhel: fix log directory permissions")
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Markos Chandras <mchandras@suse.de>
6 years agoovn.at: fix occasional failure - ACL reject rule test
Han Zhou [Sat, 19 May 2018 21:21:33 +0000 (14:21 -0700)]
ovn.at: fix occasional failure - ACL reject rule test

The test fails occasionally because it may starts sending packets
before the new ACL related flows are installed on HVs, even if it
ensures lflows exist in SB DB. This patch ensure the HVs are in
sync by ovn-nbctl --wait=hv sync, and removes the check for lflow
readiness in SB.

Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoodp-execute: Rename 'may_steal' to 'should_steal'.
Darrell Ball [Thu, 17 May 2018 02:24:46 +0000 (19:24 -0700)]
odp-execute: Rename 'may_steal' to 'should_steal'.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoodp-execute: Correct and clarify 'steal' parameter.
Darrell Ball [Thu, 17 May 2018 06:08:48 +0000 (23:08 -0700)]
odp-execute: Correct and clarify 'steal' parameter.

Correct and clarify 'steal'/'may_steal' comments in
odp_execute_actions().

Reported-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agotests: Make test result more predictable.
Darrell Ball [Fri, 18 May 2018 16:52:23 +0000 (09:52 -0700)]
tests: Make test result more predictable.

The test 'ofproto-dpif - in place modification (vlan)' fails often
due to miss handling. Hence, make it more predictable by specifying
that misses should just be dropped.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoerspan: fix invalid erspan version.
William Tu [Thu, 17 May 2018 20:36:47 +0000 (13:36 -0700)]
erspan: fix invalid erspan version.

ERSPAN only support version 1 and 2.  When packets send to an erspan device
which does not have proper version number set, drop the packet.  In real
case, we observe multicast packets sent to the erspan pernet device,
erspan0, which does not have erspan version configured.

Without this patch, we observe warning message from ovs-vswitchd as below,
due to receive an malformed erspan packet:

odp_util|WARN|odp_tun_key_from_attr__ invalid erspan version

Reported-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agogre: Resolve gre receive issues
Greg Rose [Thu, 17 May 2018 17:43:53 +0000 (10:43 -0700)]
gre: Resolve gre receive issues

On newer Linux kernels or on older kernels such as Red Hat that backport
from newer upstream Linux kernel releases the built-in gre kernel module
will interfere with OVS gre code in the receive path.  Fix this up by
placing the gre kernel code within the openvswitch driver so it will
not have to depend on the built-in gre kernel module.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agorhel: Enable ERSPAN features for RHEL 7.x
Greg Rose [Wed, 16 May 2018 20:13:20 +0000 (13:13 -0700)]
rhel: Enable ERSPAN features for RHEL 7.x

Enable ERSPAN on RHEL 7.x

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agoerspan: set bso when truncated bit is set.
William Tu [Sun, 13 May 2018 14:09:41 +0000 (07:09 -0700)]
erspan: set bso when truncated bit is set.

Before the patch, the erspan BSO bit (Bad/Short/Oversized) is not
handled.  BSO has 4 possible values:
  00 --> Good frame with no error, or unknown integrity
  11 --> Payload is a Bad Frame with CRC or Alignment Error
  01 --> Payload is a Short Frame
  10 --> Payload is an Oversized Frame

This patch set BSO to 01 when truncate is true.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoerspan: auto detect truncated ipv6 packets.
William Tu [Sun, 13 May 2018 14:03:49 +0000 (07:03 -0700)]
erspan: auto detect truncated ipv6 packets.

Upstream commit:
    commit d5db21a3e6977dcb42cee3d16cd69901fa66510a
    Author: William Tu <u9012063@gmail.com>
    Date:   Fri May 11 05:49:47 2018 -0700

    erspan: auto detect truncated ipv6 packets.

    Currently the truncated bit is set only when 1) the mirrored packet
    is larger than mtu and 2) the ipv4 packet tot_len is larger than
    the actual skb->len.  This patch adds another case for detecting
    whether ipv6 packet is truncated or not, by checking the ipv6 header
    payload_len and the skb->len.

Reported-by: Xiaoyan Jin <xiaoyanj@vmware.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: William Tu <u9012063@gmail.com>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agoip6erspan: make sure enough headroom at xmit.
William Tu [Fri, 11 May 2018 03:03:31 +0000 (20:03 -0700)]
ip6erspan: make sure enough headroom at xmit.

Upstream commit:
    commit e41c7c68ea771683cae5a7f81c268f38d7912ecb
    Author: William Tu <u9012063@gmail.com>
    Date:   Fri Mar 9 07:34:42 2018 -0800

    ip6erspan: make sure enough headroom at xmit.

    The patch adds skb_cow_header() to ensure enough headroom
    at ip6erspan_tunnel_xmit before pushing the erspan header
    to the skb.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agoip6erspan: improve error handling for erspan version number.
William Tu [Fri, 11 May 2018 02:51:09 +0000 (19:51 -0700)]
ip6erspan: improve error handling for erspan version number.

Upstream commit:
    commit d6aa71197ffcb68850bfebfc3fc160abe41df53b
    Author: William Tu <u9012063@gmail.com>
    Date:   Fri Mar 9 07:34:41 2018 -0800

    ip6erspan: improve error handling for erspan version number.

    When users fill in incorrect erspan version number through
    the struct erspan_metadata uapi, current code skips pushing
    the erspan header but continue pushing the gre header, which
    is incorrect.  The patch fixes it by returning error.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agoip6gre: add erspan v2 to tunnel lookup
William Tu [Fri, 11 May 2018 02:45:44 +0000 (19:45 -0700)]
ip6gre: add erspan v2 to tunnel lookup

Upstream commit:
    commit 3b04caab81649a9e8d5375b919b6653d791951df
    Author: William Tu <u9012063@gmail.com>
    Date:   Fri Mar 9 07:34:40 2018 -0800

    ip6gre: add erspan v2 to tunnel lookup

    The patch adds the erspan v2 proto in ip6gre_tunnel_lookup
    so the erspan v2 tunnel can be found correctly.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agoerspan: Add flow-based erspan options
Greg Rose [Fri, 18 May 2018 00:46:41 +0000 (17:46 -0700)]
erspan: Add flow-based erspan options

The patch add supports for flow-based erspan options.
The erspan_ver, erspan_idx, erspan_dir, and erspan_hwid can be
set as "flow" so that its value is set by the openflow rule,
instead of statically configured at port creation time.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agolib/dpif-netlink: Fix miscompare of gre ports
Greg Rose [Fri, 4 May 2018 23:48:43 +0000 (16:48 -0700)]
lib/dpif-netlink: Fix miscompare of gre ports

In netdev_to_ovs_vport_type() it checks for netdev types matching
"gre" with a strstr().  This makes it match ip6gre as well and return
OVS_VPORT_TYPE_GRE, which is clearly wrong.

Move the usage of strstr() *after* all the exact matches with strcmp()
to avoid the problem permanently because when I added the ip6gre
type I ran into a very difficult to detect bug.

Cc: Ben Pfaff <blp@ovn.org>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agoip6gre: Add ip6gre vport type
Greg Rose [Fri, 4 May 2018 17:14:44 +0000 (10:14 -0700)]
ip6gre: Add ip6gre vport type

Add handlers for OVS_VPORT_TYPE_IP6GRE

Cc: Ben Pfaff <blp@ovn.org>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agoerspan: auto detect truncated packets.
William Tu [Wed, 2 May 2018 22:06:04 +0000 (15:06 -0700)]
erspan: auto detect truncated packets.

Upstream commit:
    commit 1baf5ebf8954d9bff8fa4e7dd6c416a0cebdb9e2
    Author: William Tu <u9012063@gmail.com>
    Date:   Fri Apr 27 14:16:32 2018 -0700

    erspan: auto detect truncated packets.

    Currently the truncated bit is set only when the mirrored packet
    is larger than mtu.  For certain cases, the packet might already
    been truncated before sending to the erspan tunnel.  In this case,
    the patch detect whether the IP header's total length is larger
    than the actual skb->len.  If true, this indicated that the
    mirrored packet is truncated and set the erspan truncate bit.

    I tested the patch using bpf_skb_change_tail helper function to
    shrink the packet size and send to erspan tunnel.

Reported-by: Xiaoyan Jin <xiaoyanj@vmware.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agoopenvswitch: fix vport packet length check.
William Tu [Wed, 2 May 2018 21:45:39 +0000 (14:45 -0700)]
openvswitch: fix vport packet length check.

Upstream commit:
    commit 46e371f0e78a82186a83cbcb4f4b8850417c7dd5
    Author: William Tu <u9012063@gmail.com>
    Date:   Wed Mar 7 15:38:48 2018 -0800

    openvswitch: fix vport packet length check.

    When sending a packet to a tunnel device, the dev's hard_header_len
    could be larger than the skb->len in function packet_length().
    In the case of ip6gretap/erspan, hard_header_len = LL_MAX_HEADER + t_hlen,
    which is around 180, and an ARP packet sent to this tunnel has
    skb->len = 42.  This causes the 'unsign int length' to become super
    large because it is negative value, causing the later ovs_vport_send
    to drop it due to over-mtu size.  The patch fixes it by setting it to 0.

Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agoerspan: add kernel datapath support
William Tu [Wed, 21 Mar 2018 21:02:25 +0000 (14:02 -0700)]
erspan: add kernel datapath support

pass check, check-kernel (4.16-rc4), check-system-userspace

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agouserspace: add erspan tunnel support.
William Tu [Tue, 15 May 2018 20:10:48 +0000 (16:10 -0400)]
userspace: add erspan tunnel support.

ERSPAN is a tunneling protocol based on GRE tunnel.  The patch
add erspan tunnel support for ovs-vswitchd with userspace datapath.
Configuring erspan tunnel is similar to gre tunnel, but with
additional erspan's parameters.  Matching a flow on erspan's
metadata is also supported, see ovs-fields for more details.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agouserspace: add gre sequence number support.
William Tu [Tue, 15 May 2018 20:10:49 +0000 (16:10 -0400)]
userspace: add gre sequence number support.

The patch adds support for gre sequence number.
Default is disable.  When enable with 'options:seq=true',
the outgoing gre packet will have its sequence number
incremented by one.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agonetdev-native-tnl: refactor the tunnel push header.
William Tu [Fri, 9 Mar 2018 21:02:23 +0000 (13:02 -0800)]
netdev-native-tnl: refactor the tunnel push header.

The patch adds additional 'struct netdev *' to the
native tunnel's push_header() interface.  This is used
for later GRE sequence number support.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
6 years agodatapath: add erspan version I and II support
William Tu [Mon, 12 Mar 2018 18:28:08 +0000 (11:28 -0700)]
datapath: add erspan version I and II support

Upstream commit:
    commit fc1372f89ffe1f58b589643b75f679e452350703
    Author: William Tu <u9012063@gmail.com>
    Date:   Thu Jan 25 13:20:11 2018 -0800

    openvswitch: add erspan version I and II support

    The patch adds support for openvswitch to configure erspan
    v1 and v2.  The OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS attr is added
    to uapi as a binary blob to support all ERSPAN v1 and v2's
    fields.  Note that Previous commit "openvswitch: Add erspan tunnel
    support." was reverted since it does not design properly.

Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agocompat: erspan: use bitfield instead of mask and offset
William Tu [Fri, 9 Mar 2018 21:52:32 +0000 (13:52 -0800)]
compat: erspan: use bitfield instead of mask and offset

Upstream commit:
    commit c69de58ba84f480879de64571d9dae5102d10ed6
    Author: William Tu <u9012063@gmail.com>
    Date:   Thu Jan 25 13:20:09 2018 -0800

    net: erspan: use bitfield instead of mask and offset

    Originally the erspan fields are defined as a group into a __be16 field,
    and use mask and offset to access each field.  This is more costly due to
    calling ntohs/htons.  The patch changes it to use bitfields.

Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Folds in the ip_gre portions of this commit.  Other portions of this
commit are included in a previous patch where it is called out.

Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agodatapath: erspan: introduce erspan v2 for ip_gre
William Tu [Fri, 9 Mar 2018 19:03:19 +0000 (11:03 -0800)]
datapath: erspan: introduce erspan v2 for ip_gre

Upstream commit:

    commit f551c91de262ba36b20c3ac19538afb4f4507441
    Author: William Tu <u9012063@gmail.com>
    Date:   Wed Dec 13 16:38:56 2017 -0800

    net: erspan: introduce erspan v2 for ip_gre

    The patch adds support for erspan version 2.  Not all features are
    supported in this patch.  The SGT (security group tag), GRA (timestamp
    granularity), FT (frame type) are set to fixed value.  Only hardware
    ID and direction are configurable.  Optional subheader is also not
    supported.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Includes some compatability layer adjustments and portions of this
commit were introduced earlier while pulling in ipv6 erspan.

Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agodatapath: Use correct tunnel receive for ip6gre
Greg Rose [Wed, 9 May 2018 22:57:10 +0000 (15:57 -0700)]
datapath: Use correct tunnel receive for ip6gre

During backports of ip6 gre I used ovs_ip_tunnel_rcv() for the
ip6gre_rcv() function but that is wrong because it processes ipv4
tunnels.  Use the correct backported ip6 tunnel receive in ip6
tunnel.c ip6_tnl_rcv().

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agodatapath: Add dellink op to ip6gre and ip6erspan tap ops
Greg Rose [Wed, 9 May 2018 22:04:45 +0000 (15:04 -0700)]
datapath: Add dellink op to ip6gre and ip6erspan tap ops

Fix an oversight in the ip6gre_tap_ops and ip6erspan_tap_ops in
which the .dellink field was not initialized leading to bugs
when trying to remove and re-add those type of ports.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agocompat: Add ipv6 GRE and IPV6 Tunneling
Greg Rose [Mon, 5 Mar 2018 18:11:57 +0000 (10:11 -0800)]
compat: Add ipv6 GRE and IPV6 Tunneling

This patch backports upstream ipv6 GRE and tunneling into the OVS
OOT (Out of Tree) datapath drivers.  The primary reason for this
is to support the ERSPAN feature.

Because there is no previous history of ipv6 GRE and tunneling it is
not possible to exactly reproduce the history of all the files in
the patch.  The two newly added files - ip6_gre.c and ip6_tunnel.c -
are cut from whole cloth out of the upstream Linux 4.15 kernel and
then modified as necessary with compatibility layer fixups.
These two files already included parts of several other upstream
commits that also touched other upstream files.  As such, this
patch may incorporate parts or all of the following commits:

d350a82 net: erspan: create erspan metadata uapi header
c69de58 net: erspan: use bitfield instead of mask and offset
b423d13 net: erspan: fix use-after-free
214bb1c net: erspan: remove md NULL check
afb4c97 ip6_gre: fix potential memory leak in ip6erspan_rcv
50670b6 ip_gre: fix potential memory leak in erspan_rcv
a734321 ip6_gre: fix error path when ip6erspan_rcv failed
dd8d5b8 ip_gre: fix error path when erspan_rcv failed
293a199 ip6_gre: fix a pontential issue in ip6erspan_rcv
d91e8db5 net: erspan: reload pointer after pskb_may_pull
ae3e133 net: erspan: fix wrong return value
c05fad5 ip_gre: fix wrong return value of erspan_rcv
94d7d8f ip6_gre: add erspan v2 support
f551c91 net: erspan: introduce erspan v2 for ip_gre
1d7e2ed net: erspan: refactor existing erspan code
ef7baf5 ip6_gre: add ip6 erspan collect_md mode
5a963eb ip6_gre: Add ERSPAN native tunnel support
ceaa001 openvswitch: Add erspan tunnel support.
f192970 ip_gre: check packet length and mtu correctly in erspan tx
c84bed4 ip_gre: erspan device should keep dst
c122fda ip_gre: set tunnel hlen properly in erspan_tunnel_init
5513d08 ip_gre: check packet length and mtu correctly in erspan_xmit
935a974 ip_gre: get key from session_id correctly in erspan_rcv
1a66a83 gre: add collect_md mode to ERSPAN tunnel
84e54fe gre: introduce native tunnel support for ERSPAN

In cases where the listed commits also touched other source code
files then the patches are also listed separately within this
patch series.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agocompat: Fixups for some compile warnings and errors
Greg Rose [Mon, 5 Mar 2018 18:09:10 +0000 (10:09 -0800)]
compat: Fixups for some compile warnings and errors

A lot of code has been pulled in.  Fix it up to make sure it compiles
correctly.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agocompat: Add #define for gre_handle_offloads
Greg Rose [Tue, 27 Feb 2018 18:14:17 +0000 (10:14 -0800)]
compat: Add #define for gre_handle_offloads

Fixes compile errors on some 4.x kernels.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agocompat: Move function to header
Greg Rose [Tue, 27 Feb 2018 16:36:28 +0000 (08:36 -0800)]
compat: Move function to header

tnl_flags_to_gre_flags is also needed in both ip_gre.c and gre.c on
some kernels.  Move it from ip_gre.c to the common header.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agoip_gre: remove the incorrect mtu limit for ipgre tap
Xin Long [Fri, 2 Mar 2018 21:53:29 +0000 (13:53 -0800)]
ip_gre: remove the incorrect mtu limit for ipgre tap

Upstream commit:
    commit cfddd4c33c254954927942599d299b3865743146
    Author: Xin Long <lucien.xin@gmail.com>
    Date:   Mon Dec 18 14:24:35 2017 +0800

    ip_gre: remove the incorrect mtu limit for ipgre tap

    ipgre tap driver calls ether_setup(), after commit 61e84623ace3
    ("net: centralize net_device min/max MTU checking"), the range
    of mtu is [min_mtu, max_mtu], which is [68, 1500] by default.

    It causes the dev mtu of the ipgre tap device to not be greater
    than 1500, this limit value is not correct for ipgre tap device.

    Besides, it's .change_mtu already does the right check. So this
    patch is just to set max_mtu as 0, and leave the check to it's
    .change_mtu.

Fixes: 61e84623ace3 ("net: centralize net_device min/max MTU checking")
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agoip_gre: erspan: reload pointer after pskb_may_pull
William Tu [Fri, 2 Mar 2018 19:06:52 +0000 (11:06 -0800)]
ip_gre: erspan: reload pointer after pskb_may_pull

Upstream commit:
    commit d91e8db5b629a3c8c81db4dc317a66c7b5591821
    Author: William Tu <u9012063@gmail.com>
    Date:   Fri Dec 15 14:27:44 2017 -0800

    net: erspan: reload pointer after pskb_may_pull

    pskb_may_pull() can change skb->data, so we need to re-load pkt_md
    and ershdr at the right place.

Fixes: 94d7d8f29287 ("ip6_gre: add erspan v2 support")
Fixes: f551c91de262 ("net: erspan: introduce erspan v2 for ip_gre")
Signed-off-by: William Tu <u9012063@gmail.com>
Cc: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Only the ip_gre portion of the upstream commit.  The ipv6 portion
is pulled in with later patch in series.

Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agoip_gre: fix wrong return value of erspan_rcv
Haishuang Yan [Fri, 2 Mar 2018 18:42:21 +0000 (10:42 -0800)]
ip_gre: fix wrong return value of erspan_rcv

Upstream commit:
    commit c05fad5713b81b049ec6ac4eb2d304030b1efdce
    Author: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
    Date:   Fri Dec 15 10:46:16 2017 +0800

    ip_gre: fix wrong return value of erspan_rcv

    If pskb_may_pull return failed, return PACKET_REJECT instead of -ENOMEM.

Fixes: 84e54fe0a5ea ("gre: introduce native tunnel support for ERSPAN")
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agocompat/erspan: refactor existing erspan code
William Tu [Fri, 2 Mar 2018 18:12:21 +0000 (10:12 -0800)]
compat/erspan: refactor existing erspan code

Upstream commit:
    commit 1d7e2ed22f8d9171fa8b629754022f22115b3f03
    Author: William Tu <u9012063@gmail.com>
    Date:   Wed Dec 13 16:38:55 2017 -0800

    net: erspan: refactor existing erspan code

    The patch refactors the existing erspan implementation in order
    to support erspan version 2, which has additional metadata.  So, in
    stead of having one 'struct erspanhdr' holding erspan version 1,
    breaks it into 'struct erspan_base_hdr' and 'struct erspan_metadata'.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Partial of the upstream commit.  While doing backports it is pretty
much impossible to fully reconstitute all upstream commits but we're
doing our best.  Other parts of this commit are introduced in the
upcoming monster patch for ip6 gre support.

Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agoip_gre: Refactor the erspan tunnel code.
William Tu [Fri, 23 Feb 2018 16:45:13 +0000 (08:45 -0800)]
ip_gre: Refactor the erspan tunnel code.

Upstream commit:
    commit a3222dc95ca751cdc5f6ac3c9b092b160b73ed9f
    Author: William Tu <u9012063@gmail.com>
    Date:   Thu Nov 30 11:51:27 2017 -0800

    ip_gre: Refector the erpsan tunnel code.

    Move two erspan functions to header file, erspan.h, so ipv6
    erspan implementation can use it.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agoip_gre: erspan device should keep dst
Xin Long [Fri, 2 Mar 2018 17:26:50 +0000 (09:26 -0800)]
ip_gre: erspan device should keep dst

Upstream commit:
    commit c84bed440e4e11a973e8c0254d0dfaccfca41fb0
    Author: Xin Long <lucien.xin@gmail.com>
    Date:   Sun Oct 1 22:00:56 2017 +0800

    ip_gre: erspan device should keep dst

    The patch 'ip_gre: ipgre_tap device should keep dst' fixed
    the issue ipgre_tap dev mtu couldn't be updated in tx path.

    The same fix is needed for erspan as well.

Fixes: 84e54fe0a5ea ("gre: introduce native tunnel support for ERSPAN")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agoip_gre: set tunnel hlen properly in erspan_tunnel_init
Xin Long [Fri, 2 Mar 2018 17:21:43 +0000 (09:21 -0800)]
ip_gre: set tunnel hlen properly in erspan_tunnel_init

Upstream commit:
    commit c122fda271717f4fc618e0a31e833941fd5f1efd
    Author: Xin Long <lucien.xin@gmail.com>
    Date:   Sun Oct 1 22:00:55 2017 +0800

    ip_gre: set tunnel hlen properly in erspan_tunnel_init

    According to __gre_tunnel_init, tunnel->hlen should be set as the
    headers' length between inner packet and outer iphdr.

    It would be used especially to calculate a proper mtu when updating
    mtu in tnl_update_pmtu. Now without setting it, a bigger mtu value
    than expected would be updated, which hurts performance a lot.

    This patch is to fix it by setting tunnel->hlen with:
       tunnel->tun_hlen + tunnel->encap_hlen + sizeof(struct erspanhdr)

Fixes: 84e54fe0a5ea ("gre: introduce native tunnel support for ERSPAN")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agoip_gre: get key from session_id correctly in erspan_rcv
Xin Long [Fri, 2 Mar 2018 17:15:38 +0000 (09:15 -0800)]
ip_gre: get key from session_id correctly in erspan_rcv

Upstream commit:
    commit 935a9749a36828af0e8be224a5cd4bc758112c34
    Author: Xin Long <lucien.xin@gmail.com>
    Date:   Sun Oct 1 22:00:53 2017 +0800

    ip_gre: get key from session_id correctly in erspan_rcv

    erspan only uses the first 10 bits of session_id as the key to look
    up the tunnel. But in erspan_rcv, it missed 'session_id & ID_MASK'
    when getting the key from session_id.

    If any other flag is also set in session_id in a packet, it would
    fail to find the tunnel due to incorrect key in erspan_rcv.

    This patch is to add 'session_id & ID_MASK' there and also remove
    the unnecessary variable session_id.

Fixes: 84e54fe0a5ea ("gre: introduce native tunnel support for ERSPAN")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agoip_gre: check packet length and mtu correctly in erspan tx
William Tu [Fri, 23 Feb 2018 00:50:36 +0000 (16:50 -0800)]
ip_gre: check packet length and mtu correctly in erspan tx

Upstream commit:
    commit f192970de860d3ab90aa9e2a22853201a57bde78
    Author: William Tu <u9012063@gmail.com>
    Date:   Thu Oct 5 12:07:12 2017 -0700

    ip_gre: check packet length and mtu correctly in erspan tx

    Similarly to early patch for erspan_xmit(), the ARPHDR_ETHER device
    is the length of the whole ether packet.  So skb->len should subtract
    the dev->hard_header_len.

Fixes: 1a66a836da63 ("gre: add collect_md mode to ERSPAN tunnel")
Fixes: 84e54fe0a5ea ("gre: introduce native tunnel support for ERSPAN")
Signed-off-by: William Tu <u9012063@gmail.com>
Cc: Xin Long <lucien.xin@gmail.com>
Cc: David Laight <David.Laight@aculab.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agocompat/gre: add collect_md mode
William Tu [Thu, 22 Feb 2018 23:37:35 +0000 (15:37 -0800)]
compat/gre: add collect_md mode

    commit 1a66a836da630cd70f3639208da549b549ce576b
    Author: William Tu <u9012063@gmail.com>
    Date:   Fri Aug 25 09:21:28 2017 -0700

    gre: add collect_md mode to ERSPAN tunnel

    Similar to gre, vxlan, geneve, ipip tunnels, allow ERSPAN tunnels to
    operate in 'collect metadata' mode.  bpf_skb_[gs]et_tunnel_key() helpers
    can make use of it right away.  OVS can use it as well in the future.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With some adjustments for compatibility layer.

Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agogre: refactor the gre_fb_xmit
William Tu [Thu, 22 Feb 2018 22:24:28 +0000 (14:24 -0800)]
gre: refactor the gre_fb_xmit

Upstream commit:
    commit 862a03c35ed76c50a562f7406ad23315f7862642
    Author: William Tu <u9012063@gmail.com>
    Date:   Fri Aug 25 09:21:27 2017 -0700

    gre: refactor the gre_fb_xmit

    The patch refactors the gre_fb_xmit function, by creating
    prepare_fb_xmit function for later ERSPAN collect_md mode patch.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Only the prepare_fb_xmit() function is pulled in.  Compatibility
issues prevent the refactor of gre_fb_xmit() but we need the
prepare_fb_xmit() function for the subsequent patch.

Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agogre: fix goto statement typo
William Tu [Thu, 22 Feb 2018 20:52:51 +0000 (12:52 -0800)]
gre: fix goto statement typo

Upstream commit:
    commit e3d0328c76dde0b957f62f8c407b79f1d8fe3ef8
    Author: William Tu <u9012063@gmail.com>
    Date:   Tue Aug 22 17:04:05 2017 -0700

    gre: fix goto statement typo

    Fix typo: pnet_tap_faied.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agogre: introduce native tunnel support for ERSPAN
William Tu [Thu, 22 Feb 2018 20:28:02 +0000 (12:28 -0800)]
gre: introduce native tunnel support for ERSPAN

Upstream commit:
    commit 84e54fe0a5eaed696dee4019c396f8396f5a908b
    Author: William Tu <u9012063@gmail.com>
    Date:   Tue Aug 22 09:40:28 2017 -0700

    gre: introduce native tunnel support for ERSPAN

    The patch adds ERSPAN type II tunnel support.  The implementation
    is based on the draft at [1].  One of the purposes is for Linux
    box to be able to receive ERSPAN monitoring traffic sent from
    the Cisco switch, by creating a ERSPAN tunnel device.
    In addition, the patch also adds ERSPAN TX, so Linux virtual
    switch can redirect monitored traffic to the ERSPAN tunnel device.
    The traffic will be encapsulated into ERSPAN and sent out.

    The implementation reuses tunnel key as ERSPAN session ID, and
    field 'erspan' as ERSPAN Index fields:
    ./ip link add dev ers11 type erspan seq key 100 erspan 123 \
     local 172.16.1.200 remote 172.16.1.100

    To use the above device as ERSPAN receiver, configure
    Nexus 5000 switch as below:

    monitor session 100 type erspan-source
      erspan-id 123
      vrf default
      destination ip 172.16.1.200
      source interface Ethernet1/11 both
      source interface Ethernet1/12 both
      no shut
    monitor erspan origin ip-address 172.16.1.100 global

    [1] https://tools.ietf.org/html/draft-foschiano-erspan-01
    [2] iproute2 patch: http://marc.info/?l=linux-netdev&m=150306086924951&w=2
    [3] test script: http://marc.info/?l=linux-netdev&m=150231021807304&w=2

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Meenakshi Vohra <mvohra@vmware.com>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit also backports heavily from upstream gre, ip_gre and
ip_tunnel modules to support the necessary erspan ip gre
infrastructure as well as implementing a variety of compatability
layer changes for same support.

Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agocompat: Remove unsupported kernel compat code
Greg Rose [Sat, 24 Feb 2018 22:04:31 +0000 (14:04 -0800)]
compat: Remove unsupported kernel compat code

Anything less than 3.10 isn't supported since a couple of releases ago
so remove the dead code.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
6 years agoovsdb: Use new ovsdb_log_write_and_free().
Justin Pettit [Thu, 17 May 2018 17:58:47 +0000 (10:58 -0700)]
ovsdb: Use new ovsdb_log_write_and_free().

Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
6 years agoofproto-dpif-xlate: Improve tracing through groups.
Ben Pfaff [Thu, 10 May 2018 23:21:50 +0000 (16:21 -0700)]
ofproto-dpif-xlate: Improve tracing through groups.

This makes it clear which buckets from a group are executed and why.

The update to nsh.at provides an example.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoofp-print: Handle statistics more systematically.
Ben Pfaff [Thu, 10 May 2018 20:21:29 +0000 (13:21 -0700)]
ofp-print: Handle statistics more systematically.

ofp_to_string__() is supposed to call ofp_print_stats() for all kinds of
statistics, but it was only doing so haphazardly.  This commit makes it
systematic and in the process adds it to at least one case where it was
missing (and fixes up a test case).

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
6 years agoofp-group: Move formatting code for groups into ofp-group.
Ben Pfaff [Thu, 10 May 2018 20:07:45 +0000 (13:07 -0700)]
ofp-group: Move formatting code for groups into ofp-group.

This does a better job of putting related code together.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>