git.proxmox.com Git - ovs.git/log

Update changelog for 2.15.0~git20210104.def6eb1ea+dfsg1-1 release

Realign armhf test skip range due to upstream changes

Use new --with-dpdk=shared configure flag value

Bump libopenvswitch to 2.14

Gbp-Dch: ignore

Refresh py3-compat.patch for v2.15

Update taken from 2.15.0~git20210104.def6eb1ea-0ubuntu3

Bump dependency on libdpdk-dev >= 20.11

Merge branch 'master-dfsg' into 2.15

Merge commit 'def6eb1ea' into master-dfsg

Open vSwitch version 2.15.0~git20210104.def6eb1ea

Generating postinst at build time to avoid using dpkg-architecture at runtime.

Fix installing OVS alternatives on non-amd64 arch (Closes: #979366).

Releasing to unstable.

ACK removing Ben Pfaff from uploaders.

security.rst: Add more information about the Downstream mailing list.

Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>

AUTHORS: Add Renat Nurgaliyev.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

tunnel: Bareudp Tunnel Support.

There are various L3 encapsulation standards using UDP being discussed to
leverage the UDP based load balancing capability of different networks.
MPLSoUDP (https://tools.ietf.org/html/rfc7510) is one among them.

The Bareudp tunnel provides a generic L3 encapsulation support for
tunnelling different L3 protocols like MPLS, IP, NSH etc. inside a UDP
tunnel.

An example of creating a bareudp device to tunnel MPLS traffic is
given below:

$ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \
             type=bareudp options:remote_ip=2.1.1.3 \
             options:local_ip=2.1.1.2 \
             options:payload_type=0x8847 options:dst_port=6635

The bareudp device supports special handling for MPLS & IP as
they can have multiple ethertypes. The MPLS protocol can have ethertypes
ETH_P_MPLS_UC (unicast) & ETH_P_MPLS_MC (multicast). The IP protocol can
have ethertypes ETH_P_IP (v4) & ETH_P_IPV6 (v6).

A bareudp device that tunnels L3 traffic with multiple ethertypes
(MPLS & IP) can be created by passing the L3 protocol name as a string in
the payload_type field. An example of creating a bareudp device to tunnel
MPLS unicast & multicast traffic is given below:

$ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \
             type=bareudp options:remote_ip=2.1.1.3 \
             options:local_ip=2.1.1.2 \
             options:payload_type=mpls options:dst_port=6635
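
As a rough illustration of the payload_type handling described above, the
following Python sketch maps a payload_type value to the set of ethertypes
a bareudp device would carry. The function name and the lookup table are
hypothetical and are not taken from the OVS sources; only the ethertype
values are the standard Linux ones.

```python
# Standard ethertype values (as defined in the Linux headers).
ETH_P_IP = 0x0800
ETH_P_IPV6 = 0x86DD
ETH_P_MPLS_UC = 0x8847
ETH_P_MPLS_MC = 0x8848

# Hypothetical table: protocol names that expand to several ethertypes.
MULTI_ETHERTYPE = {
    "mpls": frozenset({ETH_P_MPLS_UC, ETH_P_MPLS_MC}),
    "ip": frozenset({ETH_P_IP, ETH_P_IPV6}),
}


def payload_ethertypes(payload_type):
    """Return the ethertypes implied by a payload_type option value."""
    name = payload_type.lower()
    if name in MULTI_ETHERTYPE:
        return MULTI_ETHERTYPE[name]
    # Otherwise treat the value as one numeric ethertype, e.g. "0x8847".
    return frozenset({int(payload_type, 0)})
```

With this model, "0x8847" selects MPLS unicast only, while "mpls" selects
both the unicast and the multicast ethertype.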

Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Acked-By: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

odp-util: Fix netlink message overflow with userdata.

Overly large userdata could overflow the netlink message, leading to
out-of-bounds memory accesses or an assertion failure while formatting
nested actions.

Fix that by checking the size and returning the correct error code.
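
The kind of size check meant here can be sketched in Python as follows.
This is only a model, not the odp-util code: put_attr() is a hypothetical
helper, and EMSGSIZE is picked for illustration since the message above
does not name the error code. What it relies on is that a netlink
attribute header stores its length in a 16-bit field.

```python
import errno
import struct

NLA_HDRLEN = 4        # struct nlattr: 16-bit length + 16-bit type
NLA_MAX_LEN = 0xFFFF  # largest value the 16-bit length field can hold


def put_attr(buf, attr_type, payload):
    """Append one attribute to 'buf' (a bytearray).

    Returns 0 on success or an errno value if the attribute cannot be
    represented, leaving 'buf' untouched in that case.
    """
    total = NLA_HDRLEN + len(payload)
    if total > NLA_MAX_LEN:
        return errno.EMSGSIZE  # error code chosen for illustration only
    buf += struct.pack("=HH", total, attr_type) + payload
    buf += b"\0" * (-total % 4)  # pad to 4-byte alignment
    return 0
```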

Credit to OSS-Fuzz.

Reported-at: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=27640
Fixes: e995e3df57ea ("Allow OVS_USERSPACE_ATTR_USERDATA to be variable length.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>

dpif-netlink: Fix issues of the offloaded flows counter.

The n_offloaded_flows counter is stored in the dpif, and this dpif is the
first one, opened when ofproto is created. When a flow operation is done
by an ovs-appctl command such as dpctl/add-flow, a new dpif is opened, and
the n_offloaded_flows in it can't be used. So, instead of using the
counter, the number of offloaded flows is queried from each netdev and the
results are summed. To achieve this, a new API is added to netdev_flow_api
to get the number of flows assigned to a netdev.

In order to get better performance, this number is calculated directly
from tc_to_ufid hmap for netdev-offload-tc, because flow dumping by tc
takes much time if there are many flows offloaded.
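
A minimal Python model of the approach, assuming each netdev can report
its own count: the Netdev class and its methods are hypothetical stand-ins
and do not mirror the real netdev_flow_api signatures.

```python
class Netdev:
    """Hypothetical stand-in for a netdev with an offload provider."""

    def __init__(self, name, ufid_map=None):
        self.name = name
        # None models a netdev whose provider does not report a count.
        self.ufid_map = ufid_map

    def n_offloaded_flows(self):
        if self.ufid_map is None:
            return None
        # Use the size of the map directly; no flow dump is needed.
        return len(self.ufid_map)


def total_offloaded(netdevs):
    """Sum the per-netdev counts, skipping netdevs that report nothing."""
    counts = (dev.n_offloaded_flows() for dev in netdevs)
    return sum(c for c in counts if c is not None)
```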

Fixes: af0618470507 ("dpif-netlink: Count the number of offloaded rules")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

Update tutorial for newer versions of Faucet and Open vSwitch.

Newer versions of Faucet use a dynamic OpenFlow pipeline based on what
features are enabled in the configuration file. Update log output, flow
table dumps and explanations to be consistent with newer Faucet versions.

Remove mentions of bugs that have been fixed in Faucet since the
tutorial was originally written.

Add documentation on changes to Open vSwitch commands to recommend
using a version that is compatible with the features of the tutorial.

Reported-by: Matthias Ableidinger <ableimat@gmx.at>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2018-August/047180.html
Signed-off-by: Brad Cowie <brad@wand.net.nz>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

NEWS: Move '--offload-stats' entry to correct release.

Patch landed to 2.13, not 2.12.

Fixes: 164413156cf9 ("Add offload packets statistics")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>

ovsdb-tool: Fix datum leak in the show-log command.

Fixes: 4e92542cefb7 ("ovsdb-tool: Make "show-log" convert raw JSON to easier-to-read syntax.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Dumitru Ceara <dceara@redhat.com>

test-stream: Silence memory leak report.

AddressSanitizer reports this as a leak.
Let's just free the memory before exiting to avoid the noise.

'stream_close()' doesn't update the pointer, so this will not
change the return value.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Paolo Valerio <pvalerio@redhat.com>

raft: Add some debugging information to cluster/status command.

Introduce the following info useful for cluster debugging to
cluster/status command:
- time elapsed from last start/complete election
- election trigger (e.g. timeout)
- number of disconnections
- time elapsed from last raft message received

Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

conntrack: add generic IP protocol support

Currently, userspace conntrack only tracks TCP, UDP, and ICMP. All
other IP protocols are discarded, and the +inv state is returned. This
is not in line with the kernel conntrack, where a packet is treated as
generic L3 if no L4 information can be extracted. The change below
mimics the behavior of the kernel.
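
The difference can be pictured with a small Python sketch of how a
connection key might be built. It is a simplified model, not the conntrack
implementation: conn_key() is hypothetical, and ICMP is keyed on ports
here only to keep the example short.

```python
IPPROTO_ICMP, IPPROTO_TCP, IPPROTO_UDP = 1, 6, 17
L4_TRACKED = {IPPROTO_ICMP, IPPROTO_TCP, IPPROTO_UDP}


def conn_key(src, dst, proto, sport=None, dport=None):
    """Build a lookup key for a connection (illustrative model only)."""
    if proto in L4_TRACKED:
        return (src, dst, proto, sport, dport)
    # Generic handling: no L4 fields are available, so key on L3 only
    # instead of rejecting the packet as invalid.
    return (src, dst, proto)
```

For example, a GRE packet (IP protocol 47) gets a three-field key rather
than being marked +inv.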

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

ofproto-dpif-xlate: Stop forwarding MLD reports to group ports.

According to RFC 4541, section 2.1.1, a snooping switch
should forward membership reports only to ports with
routers attached. The current code violates the RFC by
forwarding membership reports to group ports as well.
The same issue doesn't exist with IPv4.
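
The forwarding rule can be stated as a tiny Python sketch. The function
and its arguments are hypothetical and only model the decision, not the
ofproto-dpif-xlate code.

```python
def report_output_ports(in_port, router_ports, group_ports):
    """Ports a membership report is forwarded to (RFC 4541, 2.1.1).

    'group_ports' is accepted only to make explicit that it is ignored:
    reports go to router ports, never to group member ports.
    """
    return sorted(p for p in router_ports if p != in_port)
```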

Fixes: 06994f879c ("mcast-snooping: Add Multicast Listener Discovery support")
Signed-off-by: XiaoXiong Ding <dingxiaoxiong@huawei.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

github: Fix Ubuntu package installation.

Before trying to install a package, APT cache must be updated to avoid
asking for an unavailable version of a package.

Fixes: 6cb2f5a630e3 ("github: Add GitHub Actions workflow.")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

ovsdb-idl: Add comment.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>

ovsdb-idl: Improve prototypes.

Adding parameter names makes these prototypes clearer.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>

ovsdb-idl: Remove prototype for function that is not defined or used.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>

ovsdb-idl: Fix memory leak sending messages without a session.

When there's no open session, we still have to free the messages that
we make but cannot send.

I'm not confident that these fix actual bugs, because it seems possible
that these code paths can only be hit when the session is nonnull.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>

ovsdb-idl: Avoid redundant clearing and parsing of received data.

ovsdb_idl_db_parse_monitor_reply() clears the IDL and parses the
received data. There's no need to do it again afterward.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Fixes: 1b1d2e6daa56 ("ovsdb: Introduce experimental support for clustered databases.")
Acked-by: Ilya Maximets <i.maximets@ovn.org>

jsonrpc: Avoid disconnecting prematurely due to long poll intervals.

Open vSwitch has a few different jsonrpc-based protocols that depend on
jsonrpc_session to make sure that the connection is up and working.
In turn, jsonrpc_session uses the "reconnect" state machine to send
probes if nothing is received.  This works fine in normal circumstances.
In unusual circumstances, though, it can happen that the program is
busy and doesn't even try to receive anything for a long time.  Then the
timer can time out without a good reason; if it had tried to receive
something, it would have.

There's a solution that the clients of jsonrpc_session could adopt.
Instead of first calling jsonrpc_session_run(), which is what calls into
"reconnect" to deal with timing out, and then calling into
jsonrpc_session_recv(), which is what tries to receive something, they
could use the opposite order.  That would make sure that the timeout
was always based on a recent attempt to receive something.  Great.

The actual code in OVS that uses jsonrpc_session, though, tends to use
the opposite order, and there are enough users and this is a subtle
enough issue that it could get flipped back around even if we fixed it
now.  So this commit takes a different approach.  Instead of fixing
this in the users of jsonrpc_session, we fix it in the users of
reconnect: make them tell when they've tried to receive something (or
disable this particular feature).

This commit fixes the problem that way.  It's kind of hard to reproduce
but I'm pretty sure that I've seen it a number of times in testing.
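
The idea can be modeled in a few lines of Python. This is a simplified
sketch, not the "reconnect" state machine: the ProbeTimer class and its
method names are made up for illustration. The timer is treated as expired
only if the caller tried to receive at or after the deadline.

```python
class ProbeTimer:
    """Simplified model of an inactivity timer (not the OVS code)."""

    def __init__(self, probe_interval, now):
        self.probe_interval = probe_interval
        self.last_activity = now
        self.last_receive_attempt = now

    def activity(self, now):
        """Something was actually received."""
        self.last_activity = now

    def receive_attempted(self, now):
        """The caller tried to receive, whether or not data arrived."""
        self.last_receive_attempt = now

    def expired(self, now):
        deadline = self.last_activity + self.probe_interval
        # Expire only if the interval elapsed AND a receive was attempted
        # at or after the deadline, so a busy program is not penalized.
        return now >= deadline and self.last_receive_attempt >= deadline
```

A program that was busy from time 0 to time 10 with a 5-unit interval does
not see the timer expire until it has called receive_attempted().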

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>

dpdk: Update to use DPDK v20.11.

This commit adds support for DPDK v20.11. It includes the following
changes.

1. travis: Remove explicit DPDK kmods configuration.
2. sparse: Fix build with 20.05 DPDK tracepoints.
3. netdev-dpdk: Remove experimental API flag.

   http://patchwork.ozlabs.org/project/openvswitch/list/?series=173216&state=*

4. sparse: Update to DPDK 20.05 trace point header.

   http://patchwork.ozlabs.org/project/openvswitch/list/?series=179604&state=*

5. sparse: Fix build with DPDK 20.08.

   http://patchwork.ozlabs.org/project/openvswitch/list/?series=200181&state=*

6. build: Add support for DPDK meson build.

   http://patchwork.ozlabs.org/project/openvswitch/list/?series=199138&state=*

7. netdev-dpdk: Remove usage of RTE_ETH_DEV_CLOSE_REMOVE flag.

   http://patchwork.ozlabs.org/project/openvswitch/list/?series=207850&state=*

8. netdev-dpdk: Fix build with 20.11-rc1.

   http://patchwork.ozlabs.org/project/openvswitch/list/?series=209006&state=*

9. sparse: Fix __ATOMIC_* redefinition errors

   http://patchwork.ozlabs.org/project/openvswitch/list/?series=209452&state=*

10. build: Remove DPDK make build references.

   http://patchwork.ozlabs.org/project/openvswitch/list/?series=216682&state=*

For credit, all authors of the original commits to 'dpdk-latest' with the
above changes have been added as co-authors for this commit.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Co-authored-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Sunil Pai G <sunil.pai.g@intel.com>
Co-authored-by: Sunil Pai G <sunil.pai.g@intel.com>
Signed-off-by: Eli Britstein <elibr@nvidia.com>
Co-authored-by: Eli Britstein <elibr@nvidia.com>
Tested-by: Harry van Haaren <harry.van.haaren@intel.com>
Tested-by: Govindharajan, Hariprasad <hariprasad.govindharajan@intel.com>
Tested-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>

dpif-netlink: Count the number of offloaded rules

Add a counter for the offloaded rules, and display it in the command
of "ovs-appctl upcall/show".

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>

ovsdb-idl: Fix expected condition seqno when changes are pending.

Commit 17f22fe46142 tried to address this but only covered some of the
cases.

The correct way to report the expected seqno is to take into account if
there already is a condition change that was requested to the server but
not acked yet. In that case, the new condition change request will be
sent only after the already requested one is acked. That is, expected
condition seqno when conditions are up to date is db->cond_seqno + 2 in
this case.
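
As a worked example of the arithmetic, here is a hypothetical Python
helper. It assumes that each acknowledged condition change advances
cond_seqno by one, which is what the "+ 2" above implies but is not
spelled out in this message.

```python
def expected_cond_seqno(cond_seqno, request_in_flight):
    """Seqno at which a newly requested condition change takes effect.

    Assumes every acknowledged condition change bumps the seqno by one.
    If an earlier request is still unacknowledged, the new one is sent
    only after that ack, so two increments are needed.
    """
    return cond_seqno + (2 if request_in_flight else 1)
```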

Fixes: 17f22fe46142 ("ovsdb-idl: Return correct seqno from ovsdb_idl_db_set_condition().")
Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

ovsdb-cluster.at: Fix infinite loop in torture tests.

For some reason, while running cluster torture tests in GitHub Actions
workflow, failure of 'echo' command doesn't fail the loop and subshell
never exits, but keeps infinitely printing errors after breaking from
the loop on the right side of the pipeline:

testsuite: line 8591: echo: write error: Broken pipe

Presumably, that is caused by some shell configuration option, but
I have no idea which one and I'm not able to reproduce locally with
shell configuration options provided in GitHub documentation.
Let's just add an explicit 'exit' on 'echo' failure. This will
guarantee exit from the loop and the subshell regardless of
configuration.

Fixes: 0f03ae3754ec ("ovsdb: Improve timing in cluster torture test.")
Acked-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

lib/tc: fix parse act pedit for tos rewrite

Check overlap between current pedit key, which is always 4 bytes
(range [off, off + 3]), and a map entry in flower_pedit_map
sf = ROUND_DOWN(mf, 4) (range [sf|mf, (mf + sz - 1)|ef]).

So for the tos rewrite, off + 3 (3) is greater than mf, and it should be
checked against ef (4), not against mf + sz (2).
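
The numbers can be checked with a short Python sketch. The helper is
hypothetical and only models the comparison on the last byte of the key.
mf = 1 is an assumption based on the tos field sitting at byte offset 1 of
the IPv4 header; the bounds 2 and 4 are the ones quoted above.

```python
def key_end_in_range(off, mf, upper):
    """True if the last byte of the 4-byte key [off, off + 3] lies in
    the half-open range [mf, upper)."""
    return mf <= off + 3 < upper


# IPv4 tos rewrite: key at off = 0, map entry at mf = 1 with sz = 1.
OLD_BOUND = 1 + 1  # mf + sz
NEW_BOUND = 4      # ef
```

With the old bound the key is not recognized as overlapping the tos
entry; with ef as the bound it is.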

Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Simon Horman <simon.horman@netronome.com>

ovsdb-idl: Fix use-after-free when deleting orphaned rows.

It's possible that the IDL client processes multiple jsonrpc updates
in a single ovsdb_idl_run().

Considering the following updates processed in a single IDL run:
1. Update row R1 from table A while R1 is also referenced by row R2 from
   table B:
   - this adds R1 to table A's track_list.
2. Delete row R1 from table A while R1 is also referenced by row R2 from
   table B:
   - because row R2 still refers to row R1, this will create an orphan
     R1.
   - at this point R1 is still in table A's hmap.

When the IDL client calls ovsdb_idl_track_clear() after it has finished
processing the tracked changes, row R1 gets freed leaving a dangling
pointer in table A's hmap.

To fix this we don't free rows in ovsdb_idl_track_clear() if they are
orphan and still referenced by other rows, i.e., the row's 'dst_arcs'
list is not empty.  Later, when all arc sources (e.g., R2) are
deleted, the orphan R1 will be cleaned up as well.

The only exception is when the whole contents of the IDL are flushed,
in ovsdb_idl_db_clear(), in which case it's safe to free all rows.
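
The rule can be modeled in Python as below. This is a toy model with
hypothetical Row and Table classes, not the IDL data structures; 'freed'
stands in for releasing the row's memory.

```python
class Row:
    def __init__(self, uuid):
        self.uuid = uuid
        self.dst_arcs = []   # rows that still reference this row
        self.orphan = False
        self.freed = False


class Table:
    def __init__(self):
        self.rows = {}       # models the table hmap
        self.track_list = []

    def track_clear(self):
        for row in self.track_list:
            if row.orphan and row.dst_arcs:
                continue     # still referenced: keep it in the hmap
            if row.orphan:
                self.rows.pop(row.uuid, None)
                row.freed = True
        self.track_list = []
```

An orphan that is still referenced survives track_clear() and is released
only once its last reference is gone.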

Reported-by: Ilya Maximets <i.maximets@ovn.org>
Fixes: 932104f483ef ("ovsdb-idl: Add support for change tracking.")
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

ovsdb-idl: Fix memleak when deleting orphan rows.

Pure IDL orphan rows, i.e., for which no "insert" operation was seen,
which are part of tables with change tracking enabled should also be
freed when the table track_list is flushed.

Reported-by: Ilya Maximets <i.maximets@ovn.org>
Fixes: 72aeb243a52a ("ovsdb-idl: Tracking - preserve data for deleted rows.")
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

ovsdb-idl: Fix memleak when reinserting tracked orphan rows.

Considering the following updates processed by an IDL client:
1. Delete row R1 from table A while R1 is also referenced by row R2 from
   table B:
   - because row R2 still refers to row R1, this will create an orphan
     R1 but also sets row->tracked_old_datum to report to the IDL client
     that the row has been deleted.
2. Insert row R1 to table A.
   - because orphan R1 already existed in the IDL, it will be reused.
   - R1 still has row->tracked_old_datum set (and may also be on the
     table->track_list).
3. Delete row R2 from table B and row R1 from table A.
   - row->tracked_old_datum is set again but the previous
     tracked_old_datum was never freed.

IDL clients use the deleted old_datum values so when multiple delete
operations are received for a row, always track the first one as that
will match the contents of the row the IDL client knew about.
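
"Always track the first one" can be sketched in Python like this. The
TrackedRow class is hypothetical and only models the bookkeeping: a later
delete must not overwrite (and thereby leak) the value recorded first.

```python
class TrackedRow:
    """Toy model of a row with change tracking enabled."""

    def __init__(self):
        self.tracked_old_datum = None

    def track_delete(self, old_datum):
        # Keep the first recorded value: it matches what the client
        # last saw, and nothing is overwritten without being released.
        if self.tracked_old_datum is None:
            self.tracked_old_datum = old_datum
```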

Running the newly added test case with valgrind, without the fix,
produces the following report:

==23113== 327 (240 direct, 87 indirect) bytes in 1 blocks are definitely lost in loss record 43 of 43
==23113==    at 0x4C29F73: malloc (vg_replace_malloc.c:309)
==23113==    by 0x476761: xmalloc (util.c:138)
==23113==    by 0x45D8B3: ovsdb_idl_insert_row (ovsdb-idl.c:3431)
==23113==    by 0x45B7F9: ovsdb_idl_process_update2 (ovsdb-idl.c:2670)
==23113==    by 0x45AFCF: ovsdb_idl_db_parse_update__ (ovsdb-idl.c:2479)
==23113==    by 0x45B262: ovsdb_idl_db_parse_update (ovsdb-idl.c:2542)
==23113==    by 0x45ABBE: ovsdb_idl_db_parse_update_rpc (ovsdb-idl.c:2358)
==23113==    by 0x4576DD: ovsdb_idl_process_msg (ovsdb-idl.c:865)
==23113==    by 0x457973: ovsdb_idl_run (ovsdb-idl.c:944)
==23113==    by 0x40B7B9: do_idl (test-ovsdb.c:2523)
==23113==    by 0x44425D: ovs_cmdl_run_command__ (command-line.c:247)
==23113==    by 0x44430E: ovs_cmdl_run_command (command-line.c:278)
==23113==    by 0x404BA6: main (test-ovsdb.c:76)

Fixes: 72aeb243a52a ("ovsdb-idl: Tracking - preserve data for deleted rows.")
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

datapath: ovs_ct_exit to be done under ovs_lock

Upstream commit:
    commit 27de77cec985233bdf6546437b9761853265c505
    Author: Tonghao Zhang <xiangxia.m.yue@gmail.com>
    Date:   Fri Apr 17 02:57:31 2020 +0800

    net: openvswitch: ovs_ct_exit to be done under ovs_lock

    syzbot wrote:
    | =============================
    | WARNING: suspicious RCU usage
    | 5.7.0-rc1+ #45 Not tainted
    | -----------------------------
    | net/openvswitch/conntrack.c:1898 RCU-list traversed in non-reader section!!
    |
    | other info that might help us debug this:
    | rcu_scheduler_active = 2, debug_locks = 1
    | ...
    |
    | stack backtrace:
    | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014
    | Workqueue: netns cleanup_net
    | Call Trace:
    | ...
    | ovs_ct_exit
    | ovs_exit_net
    | ops_exit_list.isra.7
    | cleanup_net
    | process_one_work
    | worker_thread

    To avoid that warning, invoke the ovs_ct_exit under ovs_lock and add
    lockdep_ovsl_is_held as optional lockdep expression.

Link: https://lore.kernel.org/lkml/000000000000e642a905a0cbee6e@google.com
Fixes: 11efd5cb04a1 ("openvswitch: Support conntrack zone limit")
Cc: Pravin B Shelar <pshelar@ovn.org>
Cc: Yi-Hung Wei <yihung.wei@gmail.com>
Reported-by: syzbot+7ef50afd3a211f879112@syzkaller.appspotmail.com
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Fixes: cb2a5486a3a3 ("datapath: conntrack: Support conntrack zone limit")
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

compat: rcu: Add support for consolidated-RCU reader checking

Upstream commit:
    commit 28875945ba98d1b47a8a706812b6494d165bb0a0
    Author: Joel Fernandes (Google) <joel@joelfernandes.org>
    Date:   Tue Jul 16 18:12:22 2019 -0400

    rcu: Add support for consolidated-RCU reader checking

    This commit adds RCU-reader checks to list_for_each_entry_rcu() and
    hlist_for_each_entry_rcu().  These checks are optional, and are indicated
    by a lockdep expression passed to a new optional argument to these two
    macros.  If this optional lockdep expression is omitted, these two macros
    act as before, checking for an RCU read-side critical section.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
    [ paulmck: Update to eliminate return within macro and update comment. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Backport portion of upstream commit for hlist_for_each_entry_rcu() macro
so that it can be used in following bug fix.

Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

ovsdb-idl: Fix iteration over tracked rows with no actual data.

When idl removes orphan rows, those rows are inserted into the
'track_list'.  This allows iterators such as *_FOR_EACH_TRACKED () to
return orphan rows that never had any data to the IDL user.  In this
case, it is difficult for the user to understand whether it is a row
with no data (there was no "insert" / "modify" for this row) or it is
a row with zero data (columns were cleared by DB transaction).

The main problem with this condition is that rows without data will
have NULL pointers instead of references that should be there according
to the database schema.  For example, ovn-controller might crash:

ERROR: AddressSanitizer: SEGV on unknown address 0x000000000100
       (pc 0x00000055e9b2 bp 0x7ffef6180880 sp 0x7ffef6180860 T0)
The signal is caused by a READ memory access.
Hint: address points to the zero page.
    #0 0x55e9b1 in handle_deleted_lport /controller/binding.c
    #1 0x55e903 in handle_deleted_vif_lport /controller/binding.c:2072:5
    #2 0x55e059 in binding_handle_port_binding_changes /controller/binding.c:2155:23
    #3 0x5a6395 in runtime_data_sb_port_binding_handler /controller/ovn-controller.c:1454:10
    #4 0x5e15b3 in engine_compute /lib/inc-proc-eng.c:306:18
    #5 0x5e0faf in engine_run_node /lib/inc-proc-eng.c:352:14
    #6 0x5e0e04 in engine_run /lib/inc-proc-eng.c:377:9
    #7 0x5a03de in main /controller/ovn-controller.c
    #8 0x7f4fd9c991a2 in __libc_start_main (/lib64/libc.so.6+0x271a2)
    #9 0x483f0d in _start (/controller/ovn-controller+0x483f0d)

It doesn't make much sense to return non-real rows to the user, so it's
best to exclude them from iteration.

Test included.  Without the fix, the provided test prints empty orphan
rows that were never received by the IDL as tracked changes.

Fixes: 932104f483ef ("ovsdb-idl: Add support for change tracking.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Dumitru Ceara <dceara@redhat.com>

tests: Add overflow test for the sha1 library.

This is a unit test for the overflow detection issue fixed by commit
a1d2c5f5d9ed ("sha1: Fix algorithm for data bigger than 512 megabytes.")

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Paolo Valerio <pvalerio@redhat.com>
Tested-by: Paolo Valerio <pvalerio@redhat.com>

perf-counter: Split numbers in the output.

While trying to benchmark big functions, values could be longer than
12 digits. In this case all of them are printed without spaces, which
is hard to read.

Fixes: 619c3a42dc1e ("lib: add a hardware performance counter access library")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Greg Rose <gvrose8192@gmail.com>

checkpatch: Add check for a whitespace after cast.

Coding style says: "Put a space between the ``()`` used in a cast and
the expression whose type is cast: ``(void *) 0``.".
This style rule is frequently overlooked. Let's check for it.
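
A check of this kind can be approximated with a regular expression. The
Python sketch below is a heuristic written for illustration and is not the
expression used by checkpatch; the names CAST_NO_SPACE and
cast_missing_space are made up.

```python
import re

CAST_NO_SPACE = re.compile(
    r"(?<![\w)])"                    # not a call such as foo(x)
    r"\((?:(?:const|unsigned|signed|struct|enum|union)\s+)*"
    r"[A-Za-z_]\w*(?:\s*\*+)?\)"     # "(type)" or "(type *)"
    r"(?=[\w&(])"                    # operand follows with no space
)


def cast_missing_space(line):
    """Heuristically detect a cast not followed by a space."""
    return CAST_NO_SPACE.search(line) is not None
```

Being a heuristic, it can also flag a parenthesized expression that is
directly followed by an identifier, which would be a style problem anyway.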

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ian Stokes <ian.stokes@intel.com>

travis: Keep only arm64 builds.

All other builds are covered by GitHub Actions now. This should
decrease the time our jobs spend waiting in a queue due to the reduced
capacity of travis-ci.org.

Acked-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

github: Add GitHub Actions workflow.

This is an initial version of GitHub Actions support.  It mostly
mimics our current Travis CI build matrix with slight differences.

The main issue is that we don't have ARM support here.

A minor difference is that we cannot install 32-bit versions of libunwind
and libunbound since those are not available in the repository.

A higher concurrency level allows all tests to finish in less than 20
minutes, which is 3 times faster than in Travis.

.travis folder renamed to .ci to highlight that it is used not only for
Travis CI.  Travis CI support will soon be reduced to testing only ARM
builds and will be completely removed when travis-ci.org is turned
into read-only mode.

What happened to Travis CI:
https://mail.openvswitch.org/pipermail/ovs-dev/2020-November/377773.html

Acked-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

ci: Don't use 'native' machine for DPDK cache.

It's possible that the actual HW where CI is running differs slightly
between jobs. That makes all unit tests fail with cached DPDK
builds due to 'Illegal instruction' crashes. Changing the machine
type to 'default' to generate binaries that are as generic as possible
and avoid this kind of issue.

Changing the name of a cache version file, so we will not use old
'native' builds that are currently in cache.

Fixes: 7654a3ed0b38 ("travis: Cache DPDK build.")
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

python: Update build system to ensure dirs.py is created.

Update build system to ensure dirs.py is created when it is a
dependency for a build target. Also, update setup.py to
check for that dependency.

Fixes: 943c4a325045 ("python: set ovs.dirs variables with build system values")
Signed-off-by: Mark Gray <mark.d.gray@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

Fix dh_installinit dh_installsystemd calls, and switch to debhelper-compat
12 (Closes: #961746).

Fix ifupdown.sh script (Closes: #964029).

Update scripts to support RHEL 7.9

Add support for RHEL7.9 GA release with kernel 3.10.0-1160

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>

ovsdb-idl: Return correct seqno from ovsdb_idl_db_set_condition().

If an IDL client sets the same monitor condition twice, the expected
seqno when the IDL contents are updated should be the same for both
calls.

In the following scenario:
1. Client calls ovsdb_idl_db_set_condition(db, table, cond1)
2. ovsdb_idl sends monitor_cond_change(cond1) but the server doesn't yet
reply.
3. Client calls ovsdb_idl_db_set_condition(db, table, cond1)

At step 3 the returned expected seqno should be the same as at step 1.
Similarly, if step 2 is skipped, i.e., the client sets
the condition twice in the same iteration, then both
ovsdb_idl_db_set_condition() calls should return the same value.

Fixes: 46437c5232bd ("ovsdb-idl: Enhance conditional monitoring API")
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

ovsdb-idl: Fix *_is_new() IDL functions.

Currently all functions of the type *_is_new() always return
'false'. This patch resolves this issue by using the
'OVSDB_IDL_CHANGE_INSERT' 'change_seqno' instead of the
'OVSDB_IDL_CHANGE_MODIFY' 'change_seqno' to determine if a row
is new and by resetting the 'OVSDB_IDL_CHANGE_INSERT'
'change_seqno' on clear.

Further to this, the code is also updated to match the following
behaviour:

When a row is inserted, the 'OVSDB_IDL_CHANGE_INSERT'
'change_seqno' is updated to match the new database
change_seqno. The 'OVSDB_IDL_CHANGE_MODIFY' 'change_seqno'
is not set for inserted rows (only for updated rows).

At the end of a run, ovsdb_idl_db_track_clear() should be
called to clear all tracking information, this includes
resetting all row 'change_seqno' to zero. This will ensure
that subsequent runs will not see a previously 'new' row.

add_tracked_change_for_references() is updated to only
track rows that reference the current row.

Also, update unit tests in order to test the *_is_new(),
*_is_delete() functions.
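
The intended behaviour can be modeled with a short Python sketch. The
names below are hypothetical and only mirror the description above; in
particular, testing the insert seqno against zero is an assumption about
how "new" is decided.

```python
INSERT, MODIFY, DELETE = range(3)


class IdlRow:
    def __init__(self):
        self.change_seqno = [0, 0, 0]


def row_insert(row, db_seqno):
    row.change_seqno[INSERT] = db_seqno  # MODIFY is left untouched


def row_modify(row, db_seqno):
    row.change_seqno[MODIFY] = db_seqno


def is_new(row):
    # Assumption: a row is new if its insert seqno has been set.
    return row.change_seqno[INSERT] > 0


def track_clear(rows):
    # Reset all tracking information at the end of a run.
    for row in rows:
        row.change_seqno = [0, 0, 0]
```

After track_clear(), a row inserted in an earlier run no longer reports as
new, and a later modification does not make it new again.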

Suggested-by: Dumitru Ceara <dceara@redhat.com>
Reported-at: https://bugzilla.redhat.com/1883562
Fixes: ca545a787ac0 ("ovsdb-idl.c: Increase seqno for change-tracking of table references.")
Signed-off-by: Mark Gray <mark.d.gray@redhat.com>
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

ovsdb-idl.at: Return stream open_block python tests.

Invocations of CHECK_STREAM_OPEN_BLOCK_PY were accidentally removed
during the python2 to python3 conversion.  So, these tests have not been
run since that time.

This change brings the tests back.  CHECK_STREAM_OPEN_BLOCK_PY needed
updates, so instead I refactored the function for C tests to be able to
perform python tests too.  Also, added a test for python with IPv6.

Fixes: 1ca0323e7c29 ("Require Python 3 and remove support for Python 2.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Gaetan Rivet <grive@u256.net>

compat: Fix compile warning.

In ../compat/nf_conntrack_reasm.c nf_frags_cache_name is declared
if OVS_NF_DEFRAG6_BACKPORT is defined. However, later in the patch
it is only used if HAVE_INET_FRAGS_WITH_FRAGS_WORK is defined and
HAVE_INET_FRAGS_RND is not defined. This will cause a compile warning
about unused variables.

Fix it up by using the same defines that enable its use to decide
if it should be declared and avoid the compiler warning.

Fixes: 4a90b277baca ("compat: Fixup ipv6 fragmentation on 4.9.135+ kernels")
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

compat: Fix build issue on RHEL 7.7.

RHEL 7.2 introduced a KABI fixup in struct sk_buff for the name
change of l4_rxhash to l4_hash. Then patch
9ba57fc7cccc ("datapath: Add hash info to upcall") introduced a
compile error by using l4_hash and not fixing up the HAVE_L4_RXHASH
configuration flag.

Remove all references to HAVE_L4_RXHASH and always use l4_hash to
resolve the issue. This will break compilation on RHEL 7.0 and
RHEL 7.1 but dropping support for these older kernels shouldn't be
a problem.

Fixes: 9ba57fc7cccc ("datapath: Add hash info to upcall")
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

compat: Remove stale code.

Remove stale and unused code left over after support for kernels
older than 3.10 was removed.

Fixes: 8063e0958780 ("datapath: Drop support for kernel older than 3.10")
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

tests: Add parse-flow tests for MPLS fields.

Currently "ovs-ofctl parse-flows (NXM)" test doesn't test MPLS fields at all.

This commit adds a test for the 4 MPLS fields (mpls_label, mpls_tc,
mpls_bos and mpls_ttl) to "ovs-ofctl parse-flows (NXM)" test.

Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

ofp-actions: Fix userspace support for mpls_ttl.

Currently mpls_ttl is ignored when a flow is added because MFF_MPLS_TTL is
not handled in nx_put_raw().

This commit adds the correct handling of MFF_MPLS_TTL in nx_put_raw().

Fixes: bef3f465bcd5 ("openflow: Support matching and modifying MPLS TTL field.")
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

netdev-dpdk: Add option to configure VF MAC address.

In some cloud topologies, using DPDK VF representors in guest requires
configuring a VF before it is assigned to the guest.

A first basic option for such configuration is setting the VF MAC
address. Add a key 'dpdk-vf-mac' to the 'options' column of the Interface
table.

This option can be used as such:

$ ovs-vsctl add-port br0 dpdk-rep0 -- set Interface dpdk-rep0 type=dpdk \
options:dpdk-vf-mac=00:11:22:33:44:55

Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Eli Britstein <elibr@nvidia.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Gaetan Rivet <grive@u256.net>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

netdev-dpdk: Add ability to set MAC address.

It is possible to set the MAC address of DPDK ports by calling
rte_eth_dev_default_mac_addr_set(). OvS does not actually call
this function for non-internal ports, but the implementation is
exposed to be used in a later commit.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Gaetan Rivet <grive@u256.net>

python: Don't raise an Exception on failure to connect via SSL.

With other socket types, trying to connect and failing will return
an error code, but if an SSL Stream is used, then when
check_connection_completion(sock) is called, SSL will raise an
exception that doesn't derive from socket.error, which is the
exception type that is handled.

This adds handling for SSL.SysCallError which has the same
arguments as socket.error (errno, string). A future enhancement
could be to go through SSLStream class and implement error
checking for all of the possible exceptions similar to how
lib/stream-ssl.c's interpret_ssl_error() works across the various
methods that are implemented.

Fixes: d90ed7d65ba8 ("python: Add SSL support to the python ovs client library")
Signed-off-by: Terry Wilson <twilson@redhat.com>
Acked-by: Thomas Neuman <thomas.neuman@nutanix.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

netdev-offload-dpdk: Pass L4 proto-id to match in the L3 rte_flow_item.

The offload layer clears the L4 protocol mask in the L3 item, when the
L4 item is passed for matching, as an optimization. This can be confusing
while parsing the headers in the PMD. Also, the datapath flow specifies
this field to be matched. This optimization is best left to the PMD.
This patch restores the code to pass the L4 protocol type in L3 match.

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Acked-by: Eli Britstein <elibr@mellanox.com>
Tested-by: Emma Finn <emma.finn@intel.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

AUTHORS: Add Fabrizio D'Angelo.

Signed-off-by: Fabrizio D'Angelo <fdangelo@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

lldp: correctly increase discarded count

Upstream commit:
    commit 32f0deeebc9172c3f5f4a4d02aab32e6904947f6
    Date: Sat, 18 Feb 2017 20:11:47 +0100

    lldpd: correctly increase discarded count

    When a frame cannot be decoded but has been guessed, increase the
    discarded count.

    Fix https://github.com/vincentbernat/lldpd/issues/223

Fixes: be53a5c447c3 ("auto-attach: Initial support for Auto-Attach standard")
Co-authored-by: Fabrizio D'Angelo <fdangelo@redhat.com>
Signed-off-by: Fabrizio D'Angelo <fdangelo@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

lldp: increase statsTLVsUnrecognizedTotal on unknown TLV

Upstream commit:
    commit 109bcd423cd560545ec7940d73a50c5584aebb0c
    Author: Vincent Bernat <vincent@bernat.ch>
    Date: Sat, 6 Apr 2019 21:17:25 +0200

    This was done for organization TLVs, but not for other TLVs.

    Fix https://github.com/vincentbernat/lldpd/issues/323

Fixes: be53a5c447c3 ("auto-attach: Initial support for Auto-Attach standard")
Signed-off-by: Fabrizio D'Angelo <fdangelo@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

lldp: fix a buffer overflow when handling management address TLV

Upstream commit:
    commit a8d8006c06d9ac16ebcf33295cbd625c0847ca9b
    Author: Vincent Bernat <vincent@bernat.im>
    Date: Sun, 4 Oct 2015 01:50:38 +0200

    lldp: fix a buffer overflow when handling management address TLV

    When a remote device was advertising a too large management address
    while still respecting TLV boundaries, lldpd would crash due to a buffer
    overflow. However, the buffer being a static one, this buffer overflow
    is not exploitable if hardening was not disabled. This bug exists since
    version 0.5.6.

Fixes: be53a5c447c3 ("auto-attach: Initial support for Auto-Attach standard")
Reported-by: Jonas Rudloff <jonas.t.rudloff@gmail.com>
Reported-at: https://github.com/openvswitch/ovs/pull/335
Co-authored-by: Fabrizio D'Angelo <fdangelo@redhat.com>
Signed-off-by: Fabrizio D'Angelo <fdangelo@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

lldp: Fix size of PEEK_DISCARD_UINT32()

Upstream commit:
    commit a8d8006c06d9ac16ebcf33295cbd625c0847ca9b
    Author: Jonas Johansson <jonasj76@gmail.com>
    Date:   Thu, 21 Apr 2016 11:50:06 +0200

    Fix size of PEEK_DISCARD_UINT32()

Signed-off-by: Jonas Johansson <jonasj76@gmail.com>
Fixes: be53a5c447c3 ("auto-attach: Initial support for Auto-Attach standard")
Reported-by: Jonas Rudloff <jonas.t.rudloff@gmail.com>
Reported-at: https://github.com/openvswitch/ovs/pull/336
Signed-off-by: Fabrizio D'Angelo <fdangelo@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

lldp: validate a bit more received LLDP frames

Upstream commit:
    commit 3aeae72b97716fddac290634fad02b952d981f17
    Author: Vincent Bernat <vincent@bernat.ch>
    Date:   Tue, 1 Oct 2019 21:42:42 +0200

    lldp: validate a bit more received LLDP frames

    Notably, we ensure the order and unicity of Chassis ID, Port ID and
    TTL TLV. For Chassis ID and Port ID, we also ensure the maximum size
    does not exceed 256.

    Fix https://github.com/vincentbernat/lldpd/issues/351

Fixes: be53a5c447c3 ("auto-attach: Initial support for Auto-Attach standard")
Signed-off-by: Aaron Conole <aconole@redhat.com>
Co-authored-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

sha1: Fix algorithm for data bigger than 512 megabytes.

On modern systems, size_t is 64 bits wide. sha1_update() contains a
32-bit overflow check that does not work correctly, because the compiler
promotes the expression to 64 bits as soon as a size_t variable takes
part in it. Here, however, we do want the truncation to 32 bits, since
detecting that wrap-around is the whole idea of this overflow check.

Because of this, computation of the SHA-1 checksum will always be
incorrect for any data larger than 512 megabytes, which expressed in
bits is the limit of a 32-bit integer.

In practice it means that any OVSDB transaction, bigger or equal to 512
megabytes, is considered corrupt and ovsdb-server will refuse to work
with the database file. This is especially critical for OVN southbound
database, since it tends to grow rapidly.
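
The missed carry can be modelled in a few lines of Python. This is a
simplified sketch of a split 32-bit "hi:lo" bit counter, not the actual
sha1_update() code; the function name update_bit_count and the
truncate_sum switch are made up for illustration.

```python
MASK32 = 0xFFFFFFFF


def update_bit_count(count_lo, count_hi, nbytes, truncate_sum):
    """Add 'nbytes' (counted in bits) to a bit counter split into two 32-bit halves.

    truncate_sum=True models the sum being evaluated in 32-bit arithmetic,
    truncate_sum=False models the sum being evaluated at 64-bit width.
    """
    total = count_lo + (nbytes << 3)
    if truncate_sum:
        total &= MASK32            # 32-bit arithmetic: the sum wraps around
    if total < count_lo:           # carry detection relies on that wrap-around
        count_hi = (count_hi + 1) & MASK32
    count_lo = total & MASK32      # the stored field is 32 bits wide either way
    count_hi = (count_hi + (nbytes >> 29)) & MASK32
    return count_lo, count_hi


# One more byte on top of a nearly full low word crosses the 2^32 bit boundary.
narrow = update_bit_count(0xFFFFFFF8, 0, 1, truncate_sum=True)
wide = update_bit_count(0xFFFFFFF8, 0, 1, truncate_sum=False)
```

In this model the narrow variant records the carry into the high word,
while the wide variant silently loses it, so the final bit length that
goes into the digest is wrong.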

Fixes: 5eccf359391f ("Replace SHA-1 library with one that is clearly licensed.")
Signed-off-by: Renat Nurgaliyev <impleman@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

ovsdb-idlc: Return expected sequence number while setting conditions.

ovsdb_idl_set_condition() returns a sequence number that can be used to
check if the requested conditions are acknowledged by the server.
However, database bindings do not return this value to the user, making
it impossible to check if the conditions are accepted.

Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

odp-util: Fix overflow of nested netlink attributes.

Length of nested attributes must be checked before storing to the
header.  If current length exceeds the maximum value parsing should
fail, otherwise the length value will be truncated leading to
corrupted netlink message and out-of-bound memory accesses:

  ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6310002cc838
         at pc 0x000000575470 bp 0x7ffc6c322d60 sp 0x7ffc6c322d58
  READ of size 1 at 0x6310002cc838 thread T0
  SCARINESS: 12 (1-byte-read-heap-buffer-overflow)
    #0 0x57546f in format_generic_odp_key lib/odp-util.c:2738:39
    #1 0x559e70 in check_attr_len lib/odp-util.c:3572:13
    #2 0x56581a in format_odp_key_attr lib/odp-util.c:4392:9
    #3 0x5563b9 in format_odp_action lib/odp-util.c:1192:9
    #4 0x555d75 in format_odp_actions lib/odp-util.c:1279:13
    ...

Fix that by checking the length of nested netlink attributes before
updating 'nla_len' inside the header.  Additionally, introduce an
assertion inside nl_msg_end_nested() to catch this kind of issue
before an actual overflow happens.
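
The kind of check described above can be sketched in Python as follows.
This is only an illustration under assumptions: the helper name
end_nested and the buffer layout are hypothetical and do not mirror the
OVS netlink helpers; the only fact relied on is that 'nla_len' is a
16-bit field.

```python
import struct

UINT16_MAX = 0xFFFF
NLA_HDR = struct.Struct("=HH")  # nla_len, nla_type: two 16-bit fields


def end_nested(buf, offset):
    """Store the final length of the nested attribute starting at 'offset'.

    Returns False and leaves the header untouched if the length does not
    fit into the 16-bit 'nla_len' field, instead of truncating it.
    """
    length = len(buf) - offset
    if length > UINT16_MAX:
        return False
    _, nla_type = NLA_HDR.unpack_from(buf, offset)
    NLA_HDR.pack_into(buf, offset, length, nla_type)
    return True


small = bytearray(NLA_HDR.pack(0, 7)) + bytearray(100)
large = bytearray(NLA_HDR.pack(0, 7)) + bytearray(70000)
small_ok = end_nested(small, 0)
large_ok = end_nested(large, 0)
```

Without the check, a length of 70004 bytes would be stored as
70004 & 0xFFFF, which is the truncation that corrupts the message.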

Credit to OSS-Fuzz.

Reported-at: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=20003
Fixes: 65da723b40a5 ("odp-util: Format tunnel attributes directly from netlink.")
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

python: set ovs.dirs variables with build system values

ovs/dirs.py should be auto-generated using the template
ovs/dirs.py.template at build time. This will set the
ovs.dirs python variables with a value specified by the
environment or, if the environment variable is not set, from
the build system.
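
The lookup order described above (environment first, build system value
as fallback) could look roughly like this. It is a minimal sketch, not
the generated ovs/dirs.py: the helper name dir_from_env and the fallback
path are assumptions standing in for whatever the build substitutes.

```python
import os


def dir_from_env(env_name, build_default, environ=None):
    """Return the environment override if it is set, else the build-time default."""
    environ = os.environ if environ is None else environ
    return environ.get(env_name, build_default)


# "/usr/local/var/run/openvswitch" stands in for a value that the build
# system would substitute into the template (assumed for illustration).
RUNDIR = dir_from_env("OVS_RUNDIR", "/usr/local/var/run/openvswitch")
```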

Signed-off-by: Mark Gray <mark.d.gray@redhat.com>
Acked-By: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>

Documentation: update IPsec tutorial for F32

F32 requires the "python3-openvswitch" package now. Also, the
iptables chain "IN_FedoraServer_allow" does not exist on Fedora 32.

Signed-off-by: Mark Gray <mark.d.gray@redhat.com>
Acked-by: Eric Garver <eric@garver.life>
Acked-by: Ian Stokes <ian.stokes@intel.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>

AUTHORS: Update Roi Dayan

Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>

netdev-offload-tc: Use single 'once' variable for probing tc features

There is no need for a 'once' variable per probe.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>

release-process: Policy for unmaintained branches.

While only 2 branches are formally maintained (LTS and latest release),
OVS team usually provides stable releases for other branches too, at
least for branches between LTS and latest.

When transition period ends for an old LTS, we, according to
backporting-patches.rst, could stop backporting bug fixes to branches
older than new LTS. While this might be OK for an upstream project
it doesn't sound like a user-friendly policy just because it means
that we're dropping support for branches released less than a year
ago.

Below addition to the release process might make the process a bit
smoother in terms that we will not drop support for not so old branches
even after the transition period, if committers will follow the
"as far as it goes" backporting policy. And we will provide stable
releases for these branches for at least 2 years (these releases could
be less frequent than releases on LTS branches).

After 2 year period (4 releases) committers are still free to backport
fixes they think are needed on older branches, however we will likely
not provide actual releases on these branches, unless it's specially
requested and discussed.

Additionally, "4 releases" policy aligns with the DPDK LTS support
policy, i.e. we will be able to validate and release last OVS releases
with the last available DPDK LTS, e.g. OVS 2.11 last stable release
will likely be released with the 18.11 EOL release validated.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Kevin Traynor <ktraynor@redhat.com>

release-process: Standardize designation of new LTS releases.

Standardize that we will mark a new release as LTS every two years
to avoid situation where we have a really old LTS branch that no-one
actually uses, but we have to support and provide releases for it.

This will also make release process more predictable, so users will
be able to rely on it and plan their upgrades accordingly.

As a bonus, 2 years support cycle kind of aligns with 2 years support
cycle of DPDK LTS releases.

Still keeping a window for us to discuss and avoid marking some
particular release as LTS in case of significant issues with it.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Kevin Traynor <ktraynor@redhat.com>

release-process: Add transition period for LTS releases.

While LTS change happens, according to release-process.rst, we're
immediately dropping support for the old LTS and, according to
backporting-patches.rst could stop backporting bug fixes to branches
older than new LTS. While this might be OK for an upstream project
(some upstream projects like QEMU don't support anything at all
except the last release) it doesn't sound like a user-friendly policy.

Below addition to the release process might make the process a bit
smoother in terms that we will continue support of branches a little
bit longer even after changing current LTS, i.e. providing at least a
minimal transition period (1 release frame) for users of old LTS.

Effectively, this change means that we will support branch-2.5 until
2.15 release, i.e. we will provide the last release, if any, on
branch-2.5 somewhere around Feb 2021. (I don't actually expect many
fixes there)

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Kevin Traynor <ktraynor@redhat.com>

releases: Mark 2.13 as a new LTS release.

2.5 release is 4.5 years old and I'm not aware of anyone who actually
uses it today. Release process documentation says that there is no
strict time period for nominating a new LTS release and that usually
it happens once every two years. So, proposing to nominate 2.13 as
our new LTS release since it's the first release that doesn't include
OVN inside, so we will formally not have to support it in this
repository in case there are major issues that might be hard to fix.

Suggested-by: Ben Pfaff <blp@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Ian Stokes <ian.stokes@intel.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

dpctl: Add the option 'pmd' for dump-flows.

Add the option "pmd" to "ovs-appctl dpctl/dump-flows", which
allows the user to dump only the flows of the specified pmd.

That option is useful for dumping the rules of a single pmd
when we have a large number of rules in the datapath.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Gaetan Rivet <grive@u256.net>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

netdev-offload-dpdk: Preserve HW statistics for modified flows.

In case of a flow modification, preserve the HW statistics of the old HW
flow to the new one.

Fixes: 3c7330ebf036 ("netdev-offload-dpdk: Support offload of output action.")
Signed-off-by: Eli Britstein <elibr@nvidia.com>
Reviewed-by: Gaetan Rivet <gaetanr@nvidia.com>
Acked-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Tested-by: Emma Finn <emma.finn@intel.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

ovsdb: Remove read permission of *.db from others.

Currently, when ovsdb *.db is created by ovsdb-tool it grants read
permission to others. This may incur security concerns, for example,
IPsec Pre-shared keys are stored in ovs-vswitchd.conf.db.
This patch addresses the concerns by removing permission for others.
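
As an illustration of the idea only (the real change is in the C code
behind ovsdb-tool, and the exact mode it uses is not stated here), a
file can be created without any permission bits for "others" like this.
The helper name create_db_file and the 0o640 mode are assumptions.

```python
import os
import stat
import tempfile


def create_db_file(path):
    """Create 'path' with no permission bits for "others" and return its mode."""
    # 0o640: owner read/write, group read, nothing for others.
    # The process umask can only remove further bits, never add any.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o640)
    os.close(fd)
    return stat.S_IMODE(os.stat(path).st_mode)


with tempfile.TemporaryDirectory() as tmp:
    mode = create_db_file(os.path.join(tmp, "example.db"))
```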

Reported-by: Antonin Bas <abas@vmware.com>
Acked-by: Mark Gray <mark.d.gray@redhat.com>
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

raft: Make backlog thresholds configurable.

New appctl 'cluster/set-backlog-threshold' to configure thresholds
on backlog of raft jsonrpc connections. Could be used, for example,
in some extreme conditions where the size of a database is expected
to be very large, i.e. comparable with the default 4GB threshold.

Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

raft: Set threshold on backlog for raft connections.

RAFT messages could be fairly big.  If something abnormal happens to
one of the servers in a cluster it may not be able to process all the
incoming messages in a timely manner.  This results in jsonrpc backlog
growth on the sender's side.  For example if follower gets many new
clients at once that it needs to serve, or it decides to take a
snapshot in a period of high number of database changes.
If backlog grows large enough it becomes harder and harder for follower
to process incoming raft messages, it sends outdated replies and
starts receiving snapshots and the whole raft log from the leader.
Sometimes backlog grows too high (60GB in this example):

      jsonrpc|INFO|excessive sending backlog, jsonrpc: ssl:<ip>,
                   num of msgs: 15370, backlog: 61731060773.

In this case OS might actually decide to kill the sender to free some
memory.  Anyway, it could take a lot of time for such a server to catch
up with the rest of the cluster if it has so much data to receive and
process.

Introducing backlog thresholds for jsonrpc connections.
If the sending backlog exceeds particular values (500 messages or
4GB in size), the connection will be dropped and re-created.  This
drops all the current backlog and allows starting over, increasing
the chances of cluster recovery.
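
The threshold decision itself is simple and can be sketched as below.
The names are hypothetical, and the byte limit is written as 4 GiB
because the message only says "4GB"; whether the comparison is strict
is also an assumption.

```python
MAX_BACKLOG_MSGS = 500               # message count limit from the text above
MAX_BACKLOG_BYTES = 4 * 1024 ** 3    # "4GB"; the exact constant is assumed


def backlog_exceeded(n_msgs, n_bytes,
                     max_msgs=MAX_BACKLOG_MSGS,
                     max_bytes=MAX_BACKLOG_BYTES):
    """Return True if the connection should be dropped and re-created."""
    return n_msgs > max_msgs or n_bytes > max_bytes


# Values taken from the "excessive sending backlog" log line quoted above.
from_log = backlog_exceeded(15370, 61731060773)
quiet = backlog_exceeded(10, 1024)
```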

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1888829
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

ovs-bugtool: Fix crash when enable --ovs.

When enabling '--ovs' or when not using '-y', ovs-bugtool crashes due to
Traceback (most recent call last):
  File "/usr/local/sbin/ovs-bugtool", line 1410, in <module>
    sys.exit(main())
  File "/usr/local/sbin/ovs-bugtool", line 690, in main
    for (k, v) in data.items():
RuntimeError: dictionary changed size during iteration

The patch fixes it by making a copy of the key and value.
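
The failure mode and the general shape of the fix can be reproduced
with a plain dictionary. This is a minimal sketch, not the ovs-bugtool
code: the function names and the sample data are made up for
illustration.

```python
def prune_unsafe(data):
    """Delete empty entries while iterating the live dict; returns the error text."""
    try:
        for k, v in data.items():
            if not v:
                del data[k]
    except RuntimeError as err:
        return str(err)
    return None


def prune_safe(data):
    """Iterate over a snapshot of the items, so deleting from 'data' is fine."""
    for k, v in list(data.items()):
        if not v:
            del data[k]
    return data


error = prune_unsafe({"a": None, "b": 1})
pruned = prune_safe({"a": None, "b": 1})
```

Iterating over list(data.items()) (or over a copy of the dict) costs one
extra shallow copy, which is negligible for a tool like this.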

VMware-BZ: #2663359
Fixes: 1ca0323e7c29 ("Require Python 3 and remove support for Python 2.")
Acked-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

Documentation: Fix rendering of extra repo info for RHEL 8.

In commit a82083ee3091 ("Documentation: Add extra repo info for RHEL 8")
a newline was missing to correctly generate the code block to add
codeready-builder repository.

This commit adds the missing newline to correctly generate the code block
with the RHEL 8 codeready-builder instructions.

Fixes: a82083ee3091 ("Documentation: Add extra repo info for RHEL 8")
Acked-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

raft: Avoid having more than one snapshot in-flight.

Previous commit 8c2c503bdb0d ("raft: Avoid sending equal snapshots.")
took a "safe" approach and only avoided sending exactly the same
snapshot installation requests.  However, it doesn't make much sense
to send more than one snapshot at a time.  If an obsolete snapshot
gets installed, the leader will re-send the most recent one.

With this change leader will have only 1 snapshot in-flight per
connection.  This will reduce backlogs on raft connections in case
new snapshot created while 'install_snapshot_request' is in progress
or if election timer changed in that period.

Also, not tracking the exact 'install_snapshot_request' we've sent
allows to simplify the code.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1888829
Fixes: 8c2c503bdb0d ("raft: Avoid sending equal snapshots.")
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

ovsdb-server: Reclaim heap memory after compaction.

Compaction happens at most once in 10 minutes.  That is a big time
interval for a heavily loaded ovsdb-server in cluster mode.
In 10 minutes raft logs could grow up to tens of thousands of entries
with tens of gigabytes in total size.
While compaction cleans up raft log entries, the memory in many cases
is not returned to the system, but kept in the heap of running
ovsdb-server process, and it could stay in this condition for a really
long time.  In the end one performance spike could lead to a fast
growth of the raft log and this memory will never (for a really long
time) be released to the system even if the database is empty.

Simple example how to reproduce with OVN sandbox:

1. make sandbox SANDBOXFLAGS='--nbdb-model=clustered --sbdb-model=clustered'

2. Run following script that creates 1 port group, adds 4000 acls and
   removes all of that in the end:

   # cat ../memory-test.sh
   pg_name=my_port_group
   export OVN_NB_DAEMON=$(ovn-nbctl --pidfile --detach --log-file -vsocket_util:off)
   ovn-nbctl pg-add $pg_name
   for i in $(seq 1 4000); do
     echo "Iteration: $i"
     ovn-nbctl --log acl-add $pg_name from-lport $i udp drop
   done
   ovn-nbctl acl-del $pg_name
   ovn-nbctl pg-del $pg_name
   ovs-appctl -t $(pwd)/sandbox/nb1 memory/show
   ovn-appctl -t ovn-nbctl exit
   ---

3. Stopping one of Northbound DB servers:
   ovs-appctl -t $(pwd)/sandbox/nb1 exit

   Make sure that ovsdb-server didn't compact the database before
   it was stopped.  Now we have a db file on disk that contains
   4000 fairly big transactions inside.

4. Trying to start same ovsdb-server with this file.

   # cd sandbox && ovsdb-server <...> nb1.db

   At this point ovsdb-server reads all the transactions from db
   file and performs all of them as fast as it can one by one.
   When it finishes this, raft log contains 4000 entries and
   ovsdb-server consumes (on my system) ~13GB of memory while
   database is empty.  And libc will likely never return this memory
   back to system, or, at least, will hold it for a really long time.

This patch adds a new command 'ovsdb-server/memory-trim-on-compaction'.
It's disabled by default, but once enabled, ovsdb-server will call
'malloc_trim(0)' after every successful compaction to try to return
unused heap memory back to system.  This is glibc-specific, so we
need to detect function availability at build time.
Disabled by default since it adds from 1% to 30% (depending on the
current state) to the snapshot creation time and, also, next memory
allocations will likely require requests to kernel and that might be
slower.  Could be enabled by default later if considered broadly
beneficial.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1888829
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

raft: Add log length to the memory report.

In many cases a big part of a memory consumed by ovsdb-server process
is a raft log, so it's important to add its length to the memory
report.

Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

ovsdb-idl: Add comment with program name to ovsdb_idl_loop transactions.

This can make it easier to see what daemon is committing transactions.
Sometimes, in OVN especially, it can be hard to guess.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Dumitru Ceara <dceara@redhat.com>

raft: Avoid annoying debug logs if raft is connected.

If debug logs are enabled, "raft_is_connected: true" is printed on
every call to raft_is_connected(), which is way too frequent.
These messages are not very informative and only litter the log.

Let's log only disconnected state in a rate-limited way and only
log positive case once at the moment cluster becomes connected.

Fixes: 923f01cad678 ("raft.c: Set candidate_retrying if no leader elected since last election.")
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

raft: Fix error leak on failure while saving snapshot.

Error should be destroyed before return.

Fixes: 1b1d2e6daa56 ("ovsdb: Introduce experimental support for clustered databases.")
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

packets: Un-inline functions needed by DDlog.

DDlog uses these functions from Rust, but Rust can't use inline
functions (since it doesn't compile C headers but only links
against a C-compatible ABI). Thus, move the implementations
of these functions to a .c file.

I don't think any of these functions is likely to be an
important part of a "fast path" in OVS, but if that's wrong,
then we could take another approach.

Signed-off-by: Leonid Ryzhyk <lryzhyk@vmware.com>
Co-authored-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Numan Siddique <numans@ovn.org>

NEWS: Move GTP-U entry to correct release.

GTP-U support was released in 2.14, not 2.13.

Fixes: 3c6d05a02e0f ("userspace: Add GTP-U support.")
Acked-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

raft: Report jsonrpc backlog in kilobytes.

While sending snapshots, the backlog on raft connections can quickly
grow over 4GB, which overflows the raft-backlog counter.

Let's report it in kB instead. (Using kB and not KB to match with
ru_maxrss counter reported by kernel)
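
A small model of the overflow, assuming a 32-bit unsigned counter and
1 kB = 1000 bytes (both are assumptions for this sketch, as are the
helper names):

```python
UINT32_MAX = 0xFFFFFFFF


def as_uint32(value):
    """Model storing 'value' into a 32-bit unsigned counter."""
    return value & UINT32_MAX


def backlog_kb(n_bytes):
    """Backlog in kB; 1 kB is taken as 1000 bytes here (an assumption)."""
    return as_uint32(n_bytes // 1000)


backlog = 61731060773            # an example backlog of about 60GB
in_bytes = as_uint32(backlog)    # wraps around, the reported value is wrong
in_kb = backlog_kb(backlog)      # fits comfortably into 32 bits
```

With this unit the counter only wraps once the backlog reaches roughly
4TB, far beyond the sizes discussed here.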

Fixes: 3423cd97f88f ("ovsdb: Add raft memory usage to memory report.")
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

netdev-tc-offloads: Don't delete ufid mapping if fail to delete filter

tc_replace_flower may fail, so the return value must be checked.
If not zero, ufid can't be deleted. Otherwise the operations on this
filter may fail because its ufid is not found.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>

travis: Fix kernel download retry.

wget stops retrying to download a file when hitting fatal http errors
like 503.
But if a previous try had resulted in a partially downloaded ${file}, the
next wget call tries to download to ${file}.1.

Example:
+wget https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.16.18.tar.xz
--2020-03-18 20:51:42--  https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.16.18.tar.xz
Resolving cdn.kernel.org (cdn.kernel.org)... 151.101.1.176, 151.101.65.176, 151.101.129.176, ...
Connecting to cdn.kernel.org (cdn.kernel.org)|151.101.1.176|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 103076276 (98M) [application/x-xz]
Saving to: ‘linux-4.16.18.tar.xz’

linux-4.16.18.tar.x   0%[                    ]  13.07K  --.-KB/s    in 0s

2020-03-18 20:54:44 (133 MB/s) - Read error at byte 13383/103076276 (Connection reset by peer). Retrying.

--2020-03-18 20:54:45--  (try: 2)  https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.16.18.tar.xz
Connecting to cdn.kernel.org (cdn.kernel.org)|151.101.1.176|:443... connected.
HTTP request sent, awaiting response... 503 first byte timeout
2020-03-18 20:55:46 ERROR 503: first byte timeout.

+wget https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.16.18.tar.xz
--2020-03-18 20:55:46--  https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.16.18.tar.xz
Resolving cdn.kernel.org (cdn.kernel.org)... 151.101.1.176, 151.101.65.176, 151.101.129.176, ...
Connecting to cdn.kernel.org (cdn.kernel.org)|151.101.1.176|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 103076276 (98M) [application/x-xz]
Saving to: ‘linux-4.16.18.tar.xz.1’

linux-4.16.18.tar.x 100%[===================>]  98.30M   186MB/s    in 0.5s

2020-03-18 20:55:56 (186 MB/s) - ‘linux-4.16.18.tar.xz.1’ saved [103076276/103076276]

Fixes: 048674b45f4b ("travis: Retry kernel download on 503 first byte timeout.")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

Eliminate use of term "slave" in bond, LACP, and bundle contexts.

The new term is "member".

Most of these changes should not change user-visible behavior.  One
place where they do is in "ovs-ofctl dump-flows", which will now output
"members:..." inside "bundle" actions instead of "slaves:...".  I don't
expect this to cause real problems in most systems.  The old syntax
is still supported on input for backward compatibility.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>