git.proxmox.com Git - mirror

Update scripts to support RHEL 7.9

Add support for RHEL7.9 GA release with kernel 3.10.0-1160

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>

ovsdb-idl: Return correct seqno from ovsdb_idl_db_set_condition().

If an IDL client sets the same monitor condition twice, the expected
seqno when the IDL contents are updated should be the same for both
calls.

In the following scenario:
1. Client calls ovsdb_idl_db_set_condition(db, table, cond1)
2. ovsdb_idl sends monitor_cond_change(cond1) but the server doesn't yet
reply.
3. Client calls ovsdb_idl_db_set_condition(db, table, cond1)

At step 3 the returned expected seqno should be the same as at step 1.
Similarly, if step 2 is skipped, i.e., the client calls sets
the condition twice in the same iteration, then both
ovsdb_idl_db_set_condition() calls should return the same value.

Fixes: 46437c5232bd ("ovsdb-idl: Enhance conditional monitoring API")
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

ovsdb-idl: Fix *_is_new() IDL functions.

Currently all functions of the type *_is_new() always return
'false'. This patch resolves this issue by using the
'OVSDB_IDL_CHANGE_INSERT' 'change_seqno' instead of the
'OVSDB_IDL_CHANGE_MODIFY' 'change_seqno' to determine if a row
is new and by resetting the 'OVSDB_IDL_CHANGE_INSERT'
'change_seqno' on clear.

Further to this, the code is also updated to match the following
behaviour:

When a row is inserted, the 'OVSDB_IDL_CHANGE_INSERT'
'change_seqno' is updated to match the new database
change_seqno. The 'OVSDB_IDL_CHANGE_MODIFY' 'change_seqno'
is not set for inserted rows (only for updated rows).

At the end of a run, ovsdb_idl_db_track_clear() should be
called to clear all tracking information, this includes
resetting all row 'change_seqno' to zero. This will ensure
that subsequent runs will not see a previously 'new' row.

add_tracked_change_for_references() is updated to only
track rows that reference the current row.

Also, update unit tests in order to test the *_is_new(),
*_is_delete() functions.

Suggested-by: Dumitru Ceara <dceara@redhat.com>
Reported-at: https://bugzilla.redhat.com/1883562
Fixes: ca545a787ac0 ("ovsdb-idl.c: Increase seqno for change-tracking of table references.")
Signed-off-by: Mark Gray <mark.d.gray@redhat.com>
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

ovsdb-idl.at: Return stream open_block python tests.

Invocations of CHECK_STREAM_OPEN_BLOCK_PY was accidentally removed
during python2 to python3 conversion.  So, these tests was not
checked since that time.

This change returns tests back.  CHECK_STREAM_OPEN_BLOCK_PY needed
updates, so instead I refactored function for C tests to be able to
perform python tests too.  Also, added test for python with IPv6.

Fixes: 1ca0323e7c29 ("Require Python 3 and remove support for Python 2.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Gaetan Rivet <grive@u256.net>

compat: Fix compile warning.

In ../compat/nf_conntrack_reasm.c nf_frags_cache_name is declared
if OVS_NF_DEFRAG6_BACKPORT is defined. However, later in the patch
it is only used if HAVE_INET_FRAGS_WITH_FRAGS_WORK is defined and
HAVE_INET_FRAGS_RND is not defined. This will cause a compile warning
about unused variables.

Fix it up by using the same defines that enable its use to decide
if it should be declared and avoid the compiler warning.

Fixes: 4a90b277baca ("compat: Fixup ipv6 fragmentation on 4.9.135+ kernels")
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

compat: Fix build issue on RHEL 7.7.

RHEL 7.2 introduced a KABI fixup in struct sk_buff for the name
change of l4_rxhash to l4_hash. Then patch
9ba57fc7cccc ("datapath: Add hash info to upcall") introduced a
compile error by using l4_hash and not fixing up the HAVE_L4_RXHASH
configuration flag.

Remove all references to HAVE_L4_RXHASH and always use l4_hash to
resolve the issue. This will break compilation on RHEL 7.0 and
RHEL 7.1 but dropping support for these older kernels shouldn't be
a problem.

Fixes: 9ba57fc7cccc ("datapath: Add hash info to upcall")
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

compat: Remove stale code.

Remove stale and unused code left over after support for kernels
older than 3.10 was removed.

Fixes: 8063e0958780 ("datapath: Drop support for kernel older than 3.10")
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

tests: Add parse-flow tests for MPLS fields.

Currently "ovs-ofctl parse-flows (NXM)" test doesn't test MPLS fields at all.

This commit adds a test for the the 4 MPLS fields (mpls_label, mpls_tc,
mpls_bos and mpls_ttl) to "ovs-ofctl parse-flows (NXM)" test.

Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

ofp-actions: Fix userspace support for mpls_ttl.

Currently mpls_ttl is ignored when a flow is added because MFF_MPLS_TTL is
not handled in nx_put_raw().

This commit adds the correct handling of MFF_MPLS_TTL in nx_put_raw().

Fixes: bef3f465bcd5 ("openflow: Support matching and modifying MPLS TTL field.")
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

netdev-dpdk: Add option to configure VF MAC address.

In some cloud topologies, using DPDK VF representors in guest requires
configuring a VF before it is assigned to the guest.

A first basic option for such configuration is setting the VF MAC
address. Add a key 'dpdk-vf-mac' to the 'options' column of the Interface
table.

This option can be used as such:

$ ovs-vsctl add-port br0 dpdk-rep0 -- set Interface dpdk-rep0 type=dpdk \
options:dpdk-vf-mac=00:11:22:33:44:55

Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Eli Britstein <elibr@nvidia.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Gaetan Rivet <grive@u256.net>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

netdev-dpdk: Add ability to set MAC address.

It is possible to set the MAC address of DPDK ports by calling
rte_eth_dev_default_mac_addr_set(). OvS does not actually call
this function for non-internal ports, but the implementation is
exposed to be used in a later commit.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Gaetan Rivet <grive@u256.net>

python: Don't raise an Exception on failure to connect via SSL.

With other socket types, trying to connect and failing will return
an error code, but if an SSL Stream is used, then when
check_connection_completion(sock) is called, SSL will raise an
exception that doesn't derive from socket.error which is handled.

This adds handling for SSL.SysCallError which has the same
arguments as socket.error (errno, string). A future enhancement
could be to go through SSLStream class and implement error
checking for all of the possible exceptions similar to how
lib/stream-ssl.c's interpret_ssl_error() works across the various
methods that are implemented.

Fixes: d90ed7d65ba8 ("python: Add SSL support to the python ovs client library")
Signed-off-by: Terry Wilson <twilson@redhat.com>
Acked-by: Thomas Neuman <thomas.neuman@nutanix.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

netdev-offload-dpdk: Pass L4 proto-id to match in the L3 rte_flow_item.

The offload layer clears the L4 protocol mask in the L3 item, when the
L4 item is passed for matching, as an optimization. This can be confusing
while parsing the headers in the PMD. Also, the datapath flow specifies
this field to be matched. This optimization is best left to the PMD.
This patch restores the code to pass the L4 protocol type in L3 match.

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Acked-by: Eli Britstein <elibr@mellanox.com>
Tested-by: Emma Finn <emma.finn@intel.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

AUTHORS: Add Fabrizio D'Angelo.

Signed-off-by: Fabrizio D'Angelo <fdangelo@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

lldp: correctly increase discarded count

Upstream commit:
    commit 32f0deeebc9172c3f5f4a4d02aab32e6904947f6
    Date: Sat, 18 Feb 2017 20:11:47 +0100

    lldpd: correctly increase discarded count

    When a frame cannot be decoded but has been guessed, increase the
    discarded count.

    Fix https://github.com/vincentbernat/lldpd/issues/223

Fixes: be53a5c447c3 ("auto-attach: Initial support for Auto-Attach standard")
Co-authored-by: Fabrizio D'Angelo <fdangelo@redhat.com>
Signed-off-by: Fabrizio D'Angelo <fdangelo@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

lldp: increase statsTLVsUnrecognizedTotal on unknown TLV

Upstream commit:
    commit 109bcd423cd560545ec7940d73a50c5584aebb0c
    Author: Vincent Bernat <vincent@bernat.ch>
    Date: Sat, 6 Apr 2019 21:17:25 +0200

    This was done for organization TLVs, but not for other TLVs.

    Fix https://github.com/vincentbernat/lldpd/issues/323

Fixes: be53a5c447c3 ("auto-attach: Initial support for Auto-Attach standard")
Signed-off-by: Fabrizio D'Angelo <fdangelo@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

lldp: fix a buffer overflow when handling management address TLV

Upstream commit:
    commit a8d8006c06d9ac16ebcf33295cbd625c0847ca9b
    Author: Vincent Bernat <vincent@bernat.im>
    Date: Sun, 4 Oct 2015 01:50:38 +0200

    lldp: fix a buffer overflow when handling management address TLV

    When a remote device was advertising a too large management address
    while still respecting TLV boundaries, lldpd would crash due to a buffer
    overflow. However, the buffer being a static one, this buffer overflow
    is not exploitable if hardening was not disabled. This bug exists since
    version 0.5.6.

Fixes: be53a5c447c3 ("auto-attach: Initial support for Auto-Attach standard")
Reported-by: Jonas Rudloff <jonas.t.rudloff@gmail.com>
Reported-at: https://github.com/openvswitch/ovs/pull/335
Co-authored-by: Fabrizio D'Angelo <fdangelo@redhat.com>
Signed-off-by: Fabrizio D'Angelo <fdangelo@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

lldp: Fix size of PEEK_DISCARD_UINT32()

Upstream commit:
    commit a8d8006c06d9ac16ebcf33295cbd625c0847ca9b
    Author: Jonas Johansson <jonasj76@gmail.com>
    Date:   Thu, 21 Apr 2016 11:50:06 +0200

    Fix size of PEEK_DISCARD_UINT32()

Signed-off-by: Jonas Johansson <jonasj76@gmail.com>
Fixes: be53a5c447c3 ("auto-attach: Initial support for Auto-Attach standard")
Reported-by: Jonas Rudloff <jonas.t.rudloff@gmail.com>
Reported-at: https://github.com/openvswitch/ovs/pull/336
Signed-off-by: Fabrizio D'Angelo <fdangelo@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

lldp: validate a bit more received LLDP frames

Upstream commit:
    commit 3aeae72b97716fddac290634fad02b952d981f17
    Author: Vincent Bernat <vincent@bernat.ch>
    Date:   Tue, 1 Oct 2019 21:42:42 +0200

    lldp: validate a bit more received LLDP frames

    Notably, we ensure the order and unicity of Chassis ID, Port ID and
    TTL TLV. For Chassis ID and Port ID, we also ensure the maximum size
    does not exceed 256.

    Fix https://github.com/vincentbernat/lldpd/issues/351

Fixes: be53a5c447c3 ("auto-attach: Initial support for Auto-Attach standard")
Signed-off-by: Aaron Conole <aconole@redhat.com>
Co-authored-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

sha1: Fix algorithm for data bigger than 512 megabytes.

In modern systems, size_t is 64 bits. There is a 32 bit overflow check
in sha1_update(), which will not work correctly, because compiler will
do an automatic cast to 64 bits, since size_t type variable is in the
expression. We do want however to lose data, since this is the whole
idea of this overflow check.

Because of this, computation of SHA-1 checksum will always be incorrect
for any data, that is bigger than 512 megabytes, which in bits is the
boundary of 32 bits integer.

In practice it means that any OVSDB transaction, bigger or equal to 512
megabytes, is considered corrupt and ovsdb-server will refuse to work
with the database file. This is especially critical for OVN southbound
database, since it tends to grow rapidly.

Fixes: 5eccf359391f ("Replace SHA-1 library with one that is clearly licensed.")
Signed-off-by: Renat Nurgaliyev <impleman@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

ovsdb-idlc: Return expected sequence number while setting conditions.

ovsdb_idl_set_condition() returns a sequence number that can be used to
check if the requested conditions are acknowledged by the server.
However, database bindings do not return this value to the user, making
it impossible to check if the conditions are accepted.

Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

odp-util: Fix overflow of nested netlink attributes.

Length of nested attributes must be checked before storing to the
header.  If current length exceeds the maximum value parsing should
fail, otherwise the length value will be truncated leading to
corrupted netlink message and out-of-bound memory accesses:

  ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6310002cc838
         at pc 0x000000575470 bp 0x7ffc6c322d60 sp 0x7ffc6c322d58
  READ of size 1 at 0x6310002cc838 thread T0
  SCARINESS: 12 (1-byte-read-heap-buffer-overflow)
    #0 0x57546f in format_generic_odp_key lib/odp-util.c:2738:39
    #1 0x559e70 in check_attr_len lib/odp-util.c:3572:13
    #2 0x56581a in format_odp_key_attr lib/odp-util.c:4392:9
    #3 0x5563b9 in format_odp_action lib/odp-util.c:1192:9
    #4 0x555d75 in format_odp_actions lib/odp-util.c:1279:13
    ...

Fix that by checking the length of nested netlink attributes before
updating 'nla_len' inside the header.  Additionally introduced
assertion inside nl_msg_end_nested() to catch this kind of issues
before actual overflow happened.

Credit to OSS-Fuzz.

Reported-at: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=20003
Fixes: 65da723b40a5 ("odp-util: Format tunnel attributes directly from netlink.")
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

python: set ovs.dirs variables with build system values

ovs/dirs.py should be auto-generated using the template
ovs/dirs.py.template at build time. This will set the
ovs.dirs python variables with a value specified by the
environment or, if the environment variable is not set, from
the build system.

Signed-off-by: Mark Gray <mark.d.gray@redhat.com>
Acked-By: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>

Documentation: update IPsec tutorial for F32

F32 requires the "python3-openvswitch" package now. Also, the
iptables chain "IN_FedoraServer_allow" does not exist on Fedora 32.

Signed-off-by: Mark Gray <mark.d.gray@redhat.com>
Acked-by: Eric Garver <eric@garver.life>
Acked-by: Ian Stokes <ian.stokes@intel.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>

AUTHORS: Update Roi Dayan

Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>

netdev-offload-tc: Use single 'once' variable for probing tc features

There is no need for a 'once' variable per probe.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>

release-process: Policy for unmaintained branches.

While only 2 branches are formally maintained (LTS and latest release),
OVS team usually provides stable releases for other branches too, at
least for branches between LTS and latest.

When transition period ends for an old LTS, we, according to
backporting-patches.rst, could stop backporting bug fixes to branches
older than new LTS. While this might be OK for an upstream project
it doesn't sound like a user-friendly policy just because it means
that we're dropping support for branches released less than a year
ago.

Below addition to the release process might make the process a bit
smoother in terms that we will not drop support for not so old branches
even after the transition period, if committers will follow the
"as far as it goes" backporting policy. And we will provide stable
releases for these branches for at least 2 years (these releases could
be less frequent than releases on LTS branches).

After 2 year period (4 releases) committers are still free to backport
fixes they think are needed on older branches, however we will likely
not provide actual releases on these branches, unless it's specially
requested and discussed.

Additionally, "4 releases" policy aligns with the DPDK LTS support
policy, i.e. we will be able to validate and release last OVS releases
with the last available DPDK LTS, e.g. OVS 2.11 last stable release
will likely be released with the 18.11 EOL release validated.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Kevin Traynor <ktraynor@redhat.com>

release-process: Standardize designation of new LTS releases.

Standardize that we will mark a new release as LTS every two years
to avoid situation where we have a really old LTS branch that no-one
actually uses, but we have to support and provide releases for it.

This will also make release process more predictable, so users will
be able to rely on it and plan their upgrades accordingly.

As a bonus, 2 years support cycle kind of aligns with 2 years support
cycle of DPDK LTS releases.

Still keeping a window for us to discuss and avoid marking some
particular release as LTS in case of significant issues with it.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Kevin Traynor <ktraynor@redhat.com>

release-process: Add transition period for LTS releases.

While LTS change happens, according to release-process.rst, we're
immediately dropping support for the old LTS and, according to
backporting-patches.rst could stop backporting bug fixes to branches
older than new LTS. While this might be OK for an upstream project
(some upstream projects like QEMU doesn't support anything at all
except the last release) it doesn't sound like a user-friendly policy.

Below addition to the release process might make the process a bit
smoother in terms that we will continue support of branches a little
bit longer even after changing current LTS, i.e. providing at least a
minimal transition period (1 release frame) for users of old LTS.

Effectively, this change means that we will support branch-2.5 until
2.15 release, i.e. we will provide the last release, if any, on
branch-2.5 somewhere around Feb 2021. (I don't actually expect many
fixes there)

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Kevin Traynor <ktraynor@redhat.com>

releases: Mark 2.13 as a new LTS release.

2.5 release is 4.5 years old and I'm not aware of anyone who actually
uses it today. Release process documentation says that there is no
strict time period for nominating a new LTS release and that usually
it happens once in a two years. So, proposing to nominate 2.13 as
our new LTS release since it's a first release that doesn't include
OVN inside, so we will formally not have to support it in this
repository in case there are major issues that might be hard to fix.

Suggested-by: Ben Pfaff <blp@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Ian Stokes <ian.stokes@intel.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

dpctl: Add the option 'pmd' for dump-flows.

"ovs-appctl dpctl/dump-flows" added the option
"pmd" which allow user to dump pmd specified.

That option is useful to dump rules of pmd
when we have a large number of rules in dp.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Gaetan Rivet <grive@u256.net>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

netdev-offload-dpdk: Preserve HW statistics for modified flows.

In case of a flow modification, preserve the HW statistics of the old HW
flow to the new one.

Fixes: 3c7330ebf036 ("netdev-offload-dpdk: Support offload of output action.")
Signed-off-by: Eli Britstein <elibr@nvidia.com>
Reviewed-by: Gaetan Rivet <gaetanr@nvidia.com>
Acked-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Tested-by: Emma Finn <emma.finn@intel.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

ovsdb: Remove read permission of *.db from others.

Currently, when ovsdb *.db is created by ovsdb-tool it grants read
permission to others. This may incur security concerns, for example,
IPsec Pre-shared keys are stored in ovs-vsitchd.conf.db.
This patch addresses the concerns by removing permission for others.

Reported-by: Antonin Bas <abas@vmware.com>
Acked-by: Mark Gray <mark.d.gray@redhat.com>
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

raft: Make backlog thresholds configurable.

New appctl 'cluster/set-backlog-threshold' to configure thresholds
on backlog of raft jsonrpc connections. Could be used, for example,
in some extreme conditions where size of a database expected to be
very large, i.e. comparable with default 4GB threshold.

Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

raft: Set threshold on backlog for raft connections.

RAFT messages could be fairly big.  If something abnormal happens to
one of the servers in a cluster it may not be able to process all the
incoming messages in a timely manner.  This results in jsonrpc backlog
growth on the sender's side.  For example if follower gets many new
clients at once that it needs to serve, or it decides to take a
snapshot in a period of high number of database changes.
If backlog grows large enough it becomes harder and harder for follower
to process incoming raft messages, it sends outdated replies and
starts receiving snapshots and the whole raft log from the leader.
Sometimes backlog grows too high (60GB in this example):

      jsonrpc|INFO|excessive sending backlog, jsonrpc: ssl:<ip>,
                   num of msgs: 15370, backlog: 61731060773.

In this case OS might actually decide to kill the sender to free some
memory.  Anyway, It could take a lot of time for such a server to catch
up with the rest of the cluster if it has so much data to receive and
process.

Introducing backlog thresholds for jsonrpc connections.
If sending backlog will exceed particular values (500 messages or
4GB in size), connection will be dropped and re-created.  This will
allow to drop all the current backlog and start over increasing
chances of cluster recovery.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1888829
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

ovs-bugtool: Fix crash when enable --ovs.

When enabling '--ovs' or when not using '-y', ovs-bugtool crashes due to
Traceback (most recent call last):
  File "/usr/local/sbin/ovs-bugtool", line 1410, in <module>
    sys.exit(main())
  File "/usr/local/sbin/ovs-bugtool", line 690, in main
    for (k, v) in data.items():
RuntimeError: dictionary changed size during iteration

The patch fixes it by making a copy of the key and value.

VMware-BZ: #2663359
Fixes: 1ca0323e7c29 ("Require Python 3 and remove support for Python 2.")
Acked-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

Documentation: Fix rendering of extra repo info for RHEL 8.

In commit a82083ee3091 ("Documentation: Add extra repo info for RHEL 8")
a newline was missing to correctly generate the code block to add
codeready-builder repository.

This commit adds the missing newline to correctly generate the code block
with the RHEL 8 codeready-builder instructions.

Fixes: a82083ee3091 ("Documentation: Add extra repo info for RHEL 8")
Acked-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

raft: Avoid having more than one snapshot in-flight.

Previous commit 8c2c503bdb0d ("raft: Avoid sending equal snapshots.")
took a "safe" approach to not send only exactly same snapshot
installation requests.  However, it doesn't make much sense to send
more than one snapshot at a time.  If obsolete snapshot installed,
leader will re-send the most recent one.

With this change leader will have only 1 snapshot in-flight per
connection.  This will reduce backlogs on raft connections in case
new snapshot created while 'install_snapshot_request' is in progress
or if election timer changed in that period.

Also, not tracking the exact 'install_snapshot_request' we've sent
allows to simplify the code.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1888829
Fixes: 8c2c503bdb0d ("raft: Avoid sending equal snapshots.")
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

ovsdb-server: Reclaim heap memory after compaction.

Compaction happens at most once in 10 minutes.  That is a big time
interval for a heavy loaded ovsdb-server in cluster mode.
In 10 minutes raft logs could grow up to tens of thousands of entries
with tens of gigabytes in total size.
While compaction cleans up raft log entries, the memory in many cases
is not returned to the system, but kept in the heap of running
ovsdb-server process, and it could stay in this condition for a really
long time.  In the end one performance spike could lead to a fast
growth of the raft log and this memory will never (for a really long
time) be released to the system even if the database if empty.

Simple example how to reproduce with OVN sandbox:

1. make sandbox SANDBOXFLAGS='--nbdb-model=clustered --sbdb-model=clustered'

2. Run following script that creates 1 port group, adds 4000 acls and
   removes all of that in the end:

   # cat ../memory-test.sh
   pg_name=my_port_group
   export OVN_NB_DAEMON=$(ovn-nbctl --pidfile --detach --log-file -vsocket_util:off)
   ovn-nbctl pg-add $pg_name
   for i in $(seq 1 4000); do
     echo "Iteration: $i"
     ovn-nbctl --log acl-add $pg_name from-lport $i udp drop
   done
   ovn-nbctl acl-del $pg_name
   ovn-nbctl pg-del $pg_name
   ovs-appctl -t $(pwd)/sandbox/nb1 memory/show
   ovn-appctl -t ovn-nbctl exit
   ---

3. Stopping one of Northbound DB servers:
   ovs-appctl -t $(pwd)/sandbox/nb1 exit

   Make sure that ovsdb-server didn't compact the database before
   it was stopped.  Now we have a db file on disk that contains
   4000 fairly big transactions inside.

4. Trying to start same ovsdb-server with this file.

   # cd sandbox && ovsdb-server <...> nb1.db

   At this point ovsdb-server reads all the transactions from db
   file and performs all of them as fast as it can one by one.
   When it finishes this, raft log contains 4000 entries and
   ovsdb-server consumes (on my system) ~13GB of memory while
   database is empty.  And libc will likely never return this memory
   back to system, or, at least, will hold it for a really long time.

This patch adds a new command 'ovsdb-server/memory-trim-on-compaction'.
It's disabled by default, but once enabled, ovsdb-server will call
'malloc_trim(0)' after every successful compaction to try to return
unused heap memory back to system.  This is glibc-specific, so we
need to detect function availability in a build time.
Disabled by default since it adds from 1% to 30% (depending on the
current state) to the snapshot creation time and, also, next memory
allocations will likely require requests to kernel and that might be
slower.  Could be enabled by default later if considered broadly
beneficial.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1888829
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

raft: Add log length to the memory report.

In many cases a big part of a memory consumed by ovsdb-server process
is a raft log, so it's important to add its length to the memory
report.

Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

ovsdb-idl: Add comment with program name to ovsdb_idl_loop transactions.

This can make it easier to see what daemon is committing transactions.
Sometimes, in OVN especially, it can be hard to guess.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Dumitru Ceara <dceara@redhat.com>

raft: Avoid annoying debug logs if raft is connected.

If debug logs enabled, "raft_is_connected: true" printed on every
call to raft_is_connected() which is way too frequently.
These messages are not very informative and only litters the log.

Let's log only disconnected state in a rate-limited way and only
log positive case once at the moment cluster becomes connected.

Fixes: 923f01cad678 ("raft.c: Set candidate_retrying if no leader elected since last election.")
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

raft: Fix error leak on failure while saving snapshot.

Error should be destroyed before return.

Fixes: 1b1d2e6daa56 ("ovsdb: Introduce experimental support for clustered databases.")
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

packets: Un-inline functions needed by DDlog.

DDlog uses these functions from Rust, but Rust can't use inline
functions (since it doesn't compile C headers but only links
against a C-compatible ABI). Thus, move the implementations
of these functions to a .c file.

I don't think any of these functions is likely to be an
important part of a "fast path" in OVS, but if that's wrong,
then we could take another approach.

Signed-off-by: Leonid Ryzhyk <lryzhyk@vmware.com>
Co-authored-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Numan Siddique <numans@ovn.org>

NEWS: Move GTP-U entry to correct release.

GTP-U support was released in 2.14, not 2.13.

Fixes: 3c6d05a02e0f ("userspace: Add GTP-U support.")
Acked-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

raft: Report jsonrpc backlog in kilobytes.

While sending snapshots backlog on raft connections could quickly
grow over 4GB and this will overflow raft-backlog counter.

Let's report it in kB instead. (Using kB and not KB to match with
ru_maxrss counter reported by kernel)

Fixes: 3423cd97f88f ("ovsdb: Add raft memory usage to memory report.")
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

netdev-tc-offloads: Don't delete ufid mapping if fail to delete filter

tc_replace_flower may fail, so the return value must be checked.
If not zero, ufid can't be deleted. Otherwise the operations on this
filter may fail because its ufid is not found.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>

travis: Fix kernel download retry.

wget stops retrying to download a file when hitting fatal http errors
like 503.
But if a previous try had resulted in a partially downloaded ${file}, the
next wget call tries to download to ${file}.1.

Example:
+wget https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.16.18.tar.xz
--2020-03-18 20:51:42--  https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.16.18.tar.xz
Resolving cdn.kernel.org (cdn.kernel.org)... 151.101.1.176, 151.101.65.176, 151.101.129.176, ...
Connecting to cdn.kernel.org (cdn.kernel.org)|151.101.1.176|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 103076276 (98M) [application/x-xz]
Saving to: ‘linux-4.16.18.tar.xz’

linux-4.16.18.tar.x   0%[                    ]  13.07K  --.-KB/s    in 0s

2020-03-18 20:54:44 (133 MB/s) - Read error at byte 13383/103076276 (Connection reset by peer). Retrying.

--2020-03-18 20:54:45--  (try: 2)  https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.16.18.tar.xz
Connecting to cdn.kernel.org (cdn.kernel.org)|151.101.1.176|:443... connected.
HTTP request sent, awaiting response... 503 first byte timeout
2020-03-18 20:55:46 ERROR 503: first byte timeout.

+wget https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.16.18.tar.xz
--2020-03-18 20:55:46--  https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.16.18.tar.xz
Resolving cdn.kernel.org (cdn.kernel.org)... 151.101.1.176, 151.101.65.176, 151.101.129.176, ...
Connecting to cdn.kernel.org (cdn.kernel.org)|151.101.1.176|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 103076276 (98M) [application/x-xz]
Saving to: ‘linux-4.16.18.tar.xz.1’

linux-4.16.18.tar.x 100%[===================>]  98.30M   186MB/s    in 0.5s

2020-03-18 20:55:56 (186 MB/s) - ‘linux-4.16.18.tar.xz.1’ saved [103076276/103076276]

Fixes: 048674b45f4b ("travis: Retry kernel download on 503 first byte timeout.")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

Eliminate use of term "slave" in bond, LACP, and bundle contexts.

The new term is "member".

Most of these changes should not change user-visible behavior.  One
place where they do is in "ovs-ofctl dump-flows", which will now output
"members:..." inside "bundle" actions instead of "slaves:...".  I don't
expect this to cause real problems in most systems.  The old syntax
is still supported on input for backward compatibility.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>

conntrack: Rename "master" connection to "parent" connection.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>

NEWS: Move terminology update to correct place.

It's Post-v2.14.0, not v2.14.0.

Fixes: 807152a4ddfb ("Use primary/secondary, not master/slave, as names for OpenFlow roles.")
Acked-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

odp-util: Add missing comma after gtpu attributes.

Currently flows are printed like this:
'tunnel(gtpu(flags=0x7f,msgtype=0)flags(0))'
With this change:
'tunnel(gtpu(flags=0x7f,msgtype=0),flags(0))'

Fixes: 3c6d05a02e0f ("userspace: Add GTP-U support.")
Acked-by: Yi Yang <yangyi01@inspur.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

odp-util: Fix using uninitialized gtpu metadata.

If datapath flow doesn't have one of the fields of gtpu metadata, e.g.
'tunnel(gtpu())', uninitialized stack memory will be used instead.

==3485429==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x853a1b in format_u8x lib/odp-util.c:3474:13
    #1 0x86ee9c in format_odp_tun_gtpu_opt lib/odp-util.c:3713:5
    #2 0x86a099 in format_odp_tun_attr lib/odp-util.c:3973:13
    #3 0x83afe6 in format_odp_key_attr__ lib/odp-util.c:4179:9
    #4 0x838afb in odp_flow_format lib/odp-util.c:4563:17
    #5 0x738422 in log_flow_message lib/dpif.c:1750:5
    #6 0x738e2f in log_flow_put_message lib/dpif.c:1784:9
    #7 0x7371a4 in dpif_operate lib/dpif.c:1377:21
    #8 0x7363ef in dpif_flow_put lib/dpif.c:1035:5
    #9 0xc7aab7 in dpctl_put_flow lib/dpctl.c:1171:13
    #10 0xc65a4f in dpctl_unixctl_handler lib/dpctl.c:2701:17
    #11 0xaaad04 in process_command lib/unixctl.c:308:13
    #12 0xaa87f7 in run_connection lib/unixctl.c:342:17
    #13 0xaa842e in unixctl_server_run lib/unixctl.c:393:21
    #14 0x51c09c in main vswitchd/ovs-vswitchd.c:128:9
    #15 0x7f88344391a2 in __libc_start_main (/lib64/libc.so.6+0x271a2)
    #16 0x46b92d in _start (vswitchd/ovs-vswitchd+0x46b92d)

  Uninitialized value was stored to memory at
    #0 0x87da17 in scan_gtpu_metadata lib/odp-util.c:5221:27
    #1 0x874588 in parse_odp_key_mask_attr__ lib/odp-util.c:5862:9
    #2 0x83ee14 in parse_odp_key_mask_attr lib/odp-util.c:5808:18
    #3 0x83e8b5 in odp_flow_from_string lib/odp-util.c:6065:18
    #4 0xc7a4f3 in dpctl_put_flow lib/dpctl.c:1145:13
    #5 0xc65a4f in dpctl_unixctl_handler lib/dpctl.c:2701:17
    #6 0xaaad04 in process_command lib/unixctl.c:308:13
    #7 0xaa87f7 in run_connection lib/unixctl.c:342:17
    #8 0xaa842e in unixctl_server_run lib/unixctl.c:393:21
    #9 0x51c09c in main vswitchd/ovs-vswitchd.c:128:9
    #10 0x7f88344391a2 in __libc_start_main (/lib64/libc.so.6+0x271a2)

  Uninitialized value was created by an allocation of 'msgtype_ma' in the
  stack frame of function 'scan_gtpu_metadata'
    #0 0x87d440 in scan_gtpu_metadata lib/odp-util.c:5187

Fix that by initializing fields to all zeroes by default.

Reported-at: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=21426
Fixes: 3c6d05a02e0f ("userspace: Add GTP-U support.")
Acked-by: Yi Yang <yangyi01@inspur.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

netdev-offload-dpdk: Support vxlan encap offload with load actions.

Struct match has the tunnel values/masks in
match->flow.tunnel/match->wc.masks.tunnel.
Load actions such as load:0xa566c10->NXM_NX_TUN_IPV4_DST[],
load:0xbba->NXM_NX_TUN_ID[] are utilizing the tunnel masks fields,
but those should not be used for matching.
Offloading fails if masks is not clear. Clear it if no tunnel used.

Fixes: e8a2b5bf92bb ("netdev-dpdk: implement flow offload with rte flow")
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Gaetan Rivet <gaetanr@mellanox.com>
Acked-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Tested-by: Emma Finn <emma.finn@intel.com>
Signed-off-by: Lei Wang <leiw@mellanox.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

Revert "travis: Disable check for array of flexible structures in sparse."

This reverts commit 3c6b3a519ae6eae3da4cf7c59894b02b95cdade7.

The fix landed to Sparse main repository [1]:
b5d46df743be ("flex-array: allow arrays of unions with flexible members.")

[1] https://git.kernel.org/pub/scm/devel/sparse/sparse.git

Acked-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

AUTHORS: Update Eli Britstein <elibr@nvidia.com>

Signed-off-by: Eli Britstein <elibr@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

ofp-ed-props: Fix using uninitialized padding for NSH encap actions.

OVS uses memcmp to compare actions of existing and new flows, but
'struct ofp_ed_prop_nsh_md_type' and corresponding ofpact structure has
3 bytes of padding that never initialized and passed around within OF
data structures and messages.

  Uninitialized bytes in MemcmpInterceptorCommon
    at offset 21 inside [0x7090000003f8, 136)
  WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x4a184e in bcmp (vswitchd/ovs-vswitchd+0x4a184e)
    #1 0x896c8a in ofpacts_equal lib/ofp-actions.c:9121:31
    #2 0x564403 in replace_rule_finish ofproto/ofproto.c:5650:37
    #3 0x563462 in add_flow_finish ofproto/ofproto.c:5218:13
    #4 0x54a1ff in ofproto_flow_mod_finish ofproto/ofproto.c:8091:17
    #5 0x5433b2 in handle_flow_mod__ ofproto/ofproto.c:6216:17
    #6 0x56a2fc in handle_flow_mod ofproto/ofproto.c:6190:17
    #7 0x565bda in handle_single_part_openflow ofproto/ofproto.c:8504:16
    #8 0x540b25 in handle_openflow ofproto/ofproto.c:8685:21
    #9 0x6697fd in ofconn_run ofproto/connmgr.c:1329:13
    #10 0x668e6e in connmgr_run ofproto/connmgr.c:356:9
    #11 0x53f1bc in ofproto_run ofproto/ofproto.c:1890:5
    #12 0x4ead0c in bridge_run__ vswitchd/bridge.c:3250:9
    #13 0x4e9bc8 in bridge_run vswitchd/bridge.c:3309:5
    #14 0x51c072 in main vswitchd/ovs-vswitchd.c:127:9
    #15 0x7f23a99011a2 in __libc_start_main (/lib64/libc.so.6)
    #16 0x46b92d in _start (vswitchd/ovs-vswitchd+0x46b92d)

  Uninitialized value was stored to memory at
    #0 0x4745aa in __msan_memcpy.part.0 (vswitchd/ovs-vswitchd)
    #1 0x54529f in rule_actions_create ofproto/ofproto.c:3134:5
    #2 0x54915e in ofproto_rule_create ofproto/ofproto.c:5284:11
    #3 0x55d419 in add_flow_init ofproto/ofproto.c:5123:17
    #4 0x54841f in ofproto_flow_mod_init ofproto/ofproto.c:7987:17
    #5 0x543250 in handle_flow_mod__ ofproto/ofproto.c:6206:13
    #6 0x56a2fc in handle_flow_mod ofproto/ofproto.c:6190:17
    #7 0x565bda in handle_single_part_openflow ofproto/ofproto.c:8504:16
    #8 0x540b25 in handle_openflow ofproto/ofproto.c:8685:21
    #9 0x6697fd in ofconn_run ofproto/connmgr.c:1329:13
    #10 0x668e6e in connmgr_run ofproto/connmgr.c:356:9
    #11 0x53f1bc in ofproto_run ofproto/ofproto.c:1890:5
    #12 0x4ead0c in bridge_run__ vswitchd/bridge.c:3250:9
    #13 0x4e9bc8 in bridge_run vswitchd/bridge.c:3309:5
    #14 0x51c072 in main vswitchd/ovs-vswitchd.c:127:9
    #15 0x7f23a99011a2 in __libc_start_main (/lib64/libc.so.6)

  Uninitialized value was created by an allocation of 'ofpacts_stub'
  in the stack frame of function 'handle_flow_mod'
    #0 0x569e80 in handle_flow_mod ofproto/ofproto.c:6170

This could cause issues with flow modifications or other operations.

To reproduce, some NSH tests could be run under valgrind or clang
MemorySantizer. Ex. "nsh - md1 encap over a veth link" test.

Fix that by clearing padding bytes while encoding and decoding.
OVS will still accept OF messages with non-zero padding from
controllers.

New tests added to tests/ofp-actions.at.

Fixes: 1fc11c5948cf ("Generic encap and decap support for NSH")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Jan Scheurich <jan.scheurich@ericsson.com>

bond: Fix using uninitialized 'lacp_fallback_ab_cfg' for 'bond-primary'.

's->lacp_fallback_ab_cfg' initialized down below in the code, so
we're using it uninitialized to detect if we need to get 'bond-primary'
configuration.

Found by valgrind:

Conditional jump or move depends on uninitialised value(s)
    at 0x409114: port_configure_bond (bridge.c:4569)
    by 0x409114: port_configure (bridge.c:1284)
    by 0x40F6E6: bridge_reconfigure (bridge.c:917)
    by 0x411425: bridge_run (bridge.c:3330)
    by 0x406D84: main (ovs-vswitchd.c:127)
  Uninitialised value was created by a stack allocation
    at 0x408C53: port_configure (bridge.c:1190)

Fix that by moving this code to the point where 'lacp_fallback_ab_cfg'
already initialized.  Additionally clarified behavior of 'bond-primary'
in manpages for the fallback to AB case.

Fixes: b4e50218a0f8 ("bond: Add 'primary' interface concept for active-backup mode.")
Acked-by: Jeff Squyres <jsquyres@cisco.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

Documentation: Update faq and NEWS for kernel 5.8

Update the NEWS and faq now that we will support up to Linux kernel
5.8.

Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

travis: Update kernel list as of 5.8

Update the list to more closely track the LTS releases on kernel.org.

Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

acinclude: Enable builds up to Linux 5.8

Allow building openvswitch against Linux kernels up to and including
version 5.8.

Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

datapath: use hlist_for_each_entry_rcu instead of hlist_for_each_entry

Upstream commit:
    commit 64948427a63f49dd0ce403388d232f22cc1971a8
    Author: Tonghao Zhang <xiangxia.m.yue@gmail.com>
    Date:   Thu Mar 26 04:27:24 2020 +0800

    net: openvswitch: use hlist_for_each_entry_rcu instead of hlist_for_each_entry

    The struct sw_flow is protected by RCU, when traversing them,
    use hlist_for_each_entry_rcu.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Compat fixup - OVS doesn't support lockdep_ovsl_is_held() yet

Reviewed-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

datapath: Distribute switch variables for initialization

Upstream commit:
    commit 16a556eeb7ed2dc3709fe2c5be76accdfa4901ab
    Author: Kees Cook <keescook@chromium.org>
    Date:   Wed Feb 19 22:23:09 2020 -0800

    openvswitch: Distribute switch variables for initialization

    Variables declared in a switch statement before any case statements
    cannot be automatically initialized with compiler instrumentation (as
    they are not part of any execution flow). With GCC's proposed automatic
    stack variable initialization feature, this triggers a warning (and they
    don't get initialized). Clang's automatic stack variable initialization
    (via CONFIG_INIT_STACK_ALL=y) doesn't throw a warning, but it also
    doesn't initialize such variables[1]. Note that these warnings (or silent
    skipping) happen before the dead-store elimination optimization phase,
    so even when the automatic initializations are later elided in favor of
    direct initializations, the warnings remain.

    To avoid these problems, move such variables into the "case" where
    they're used or lift them up into the main function body.

    net/openvswitch/flow_netlink.c: In function ‘validate_set’:
    net/openvswitch/flow_netlink.c:2711:29: warning: statement will never be executed [-Wswitch-unreachable]
     2711 |  const struct ovs_key_ipv4 *ipv4_key;
          |                             ^~~~~~~~

    [1] https://bugs.llvm.org/show_bug.cgi?id=44916

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

datapath: use skb_list_walk_safe helper for gso segments

Upstream commit:
    commit 2cec4448db38758832c2edad439f99584bb8fa0d
    Author: Jason A. Donenfeld <Jason@zx2c4.com>
    Date:   Mon Jan 13 18:42:29 2020 -0500

    net: openvswitch: use skb_list_walk_safe helper for gso segments

    This is a straight-forward conversion case for the new function, keeping
    the flow of the existing code as intact as possible.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

datapath: support asymmetric conntrack

Upstream commit:
    commit 5d50aa83e2c8e91ced2cca77c198b468ca9210f4
    author: aaron conole <aconole@redhat.com>
    date:   tue dec 3 16:34:13 2019 -0500

    openvswitch: support asymmetric conntrack

    the openvswitch module shares a common conntrack and nat infrastructure
    exposed via netfilter.  it's possible that a packet needs both snat and
    dnat manipulation, due to e.g. tuple collision.  netfilter can support
    this because it runs through the nat table twice - once on ingress and
    again after egress.  the openvswitch module doesn't have such capability.

    like netfilter hook infrastructure, we should run through nat twice to
    keep the symmetry.

    fixes: 05752523e565 ("openvswitch: interface with nat.")
    signed-off-by: aaron conole <aconole@redhat.com>
    signed-off-by: david s. miller <davem@davemloft.net>

Fixes: c5f6c06b58d6 ("datapath: Interface with NAT.")
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

datapath: remove another BUG_ON()

Upstream commit:
    commit 8a574f86652a4540a2433946ba826ccb87f398cc
    Author: Paolo Abeni <pabeni@redhat.com>
    Date:   Sun Dec 1 18:41:25 2019 +0100

    openvswitch: remove another BUG_ON()

    If we can't build the flow del notification, we can simply delete
    the flow, no need to crash the kernel. Still keep a WARN_ON to
    preserve debuggability.

    Note: the BUG_ON() predates the Fixes tag, but this change
    can be applied only after the mentioned commit.

    v1 -> v2:
     - do not leak an skb on error

Fixes: aed067783e50 ("openvswitch: Minimize ovs_flow_cmd_del critical section.")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

datapath: drop unneeded BUG_ON() in ovs_flow_cmd_build_info()

Upstream commit:
    commit 8ffeb03fbba3b599690b361467bfd2373e8c450f
    Author: Paolo Abeni <pabeni@redhat.com>
    Date:   Sun Dec 1 18:41:24 2019 +0100

    openvswitch: drop unneeded BUG_ON() in ovs_flow_cmd_build_info()

    All the callers of ovs_flow_cmd_build_info() already deal with
    error return code correctly, so we can handle the error condition
    in a more gracefull way. Still dump a warning to preserve
    debuggability.

    v1 -> v2:
     - clarify the commit message
     - clean the skb and report the error (DaveM)

Fixes: ccb1352e76cf ("net: Add Open vSwitch kernel components.")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

datapath: fix flow command message size

Upstream commit:
    commit 4e81c0b3fa93d07653e2415fa71656b080a112fd
    Author: Paolo Abeni <pabeni@redhat.com>
    Date:   Tue Nov 26 12:55:50 2019 +0100

    openvswitch: fix flow command message size

    When user-space sets the OVS_UFID_F_OMIT_* flags, and the relevant
    flow has no UFID, we can exceed the computed size, as
    ovs_nla_put_identifier() will always dump an OVS_FLOW_ATTR_KEY
    attribute.
    Take the above in account when computing the flow command message
    size.

Fixes: 74ed7ab9264c ("openvswitch: Add support for unique flow IDs.")
Reported-by: Qi Jun Ding <qding@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

datapath: don't call pad_packet if not necessary

Upstream commit:
    commit 61ca533c0e94104c35fcb7858a23ec9a05d78143
    Author: Tonghao Zhang <xiangxia.m.yue@gmail.com>
    Date:   Thu Nov 14 23:51:08 2019 +0800

    net: openvswitch: don't call pad_packet if not necessary

    The nla_put_u16/nla_put_u32 makes sure that
    *attrlen is align. The call tree is that:

    nla_put_u16/nla_put_u32
      -> nla_put attrlen = sizeof(u16) or sizeof(u32)
      -> __nla_put attrlen
      -> __nla_reserve attrlen
      -> skb_put(skb, nla_total_size(attrlen))

    nla_total_size returns the total length of attribute
    including padding.

Cc: Joe Stringer <joe@ovn.org>
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

datapath: select vport upcall portid directly

Upstream commit:
    commit 90ce9f23a886bdef7a4b7a9bd52c7a50a6a81635
    Author: Tonghao Zhang <xiangxia.m.yue@gmail.com>
    Date:   Thu Nov 7 00:34:28 2019 +0800

    net: openvswitch: select vport upcall portid directly

    The commit 69c51582ff786 ("dpif-netlink: don't allocate per
    thread netlink sockets"), in Open vSwitch ovs-vswitchd, has
    changed the number of allocated sockets to just one per port
    by moving the socket array from a per handler structure to
    a per datapath one. In the kernel datapath, a vport will have
    only one socket in most case, if so select it directly in
    fast-path.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

datapath: simplify the ovs_dp_cmd_new

Upstream commit:
    commit eec62eadd1d757b0743ccbde55973814f3ad396e
    Author: Tonghao Zhang <xiangxia.m.yue@gmail.com>
    Date:   Fri Nov 1 22:23:54 2019 +0800

    net: openvswitch: simplify the ovs_dp_cmd_new

    use the specified functions to init resource.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

datapath: fix possible memleak on destroy flow-table

Upstream commit:
    commit 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1
    Author: Tonghao Zhang <xiangxia.m.yue@gmail.com>
    Date:   Fri Nov 1 22:23:52 2019 +0800

    net: openvswitch: fix possible memleak on destroy flow-table

    When we destroy the flow tables which may contain the flow_mask,
    so release the flow mask struct.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added additional compat layer fixup for WRITE_ONCE()

Reviewed-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

datapath: add likely in flow_lookup

Upstream commit:
    commit 0a3e01371db17d753dd92ec4d0fc6247412d3b01
    Author: Tonghao Zhang <xiangxia.m.yue@gmail.com>
    Date:   Fri Nov 1 22:23:51 2019 +0800

    net: openvswitch: add likely in flow_lookup

    The most case *index < ma->max, and flow-mask is not NULL.
    We add un/likely for performance.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

datapath: simplify the flow_hash

Upstream commit:
    commit 515b65a4b99197ae062a795ab4de919e6d04be04
    Author: Tonghao Zhang <xiangxia.m.yue@gmail.com>
    Date:   Fri Nov 1 22:23:50 2019 +0800

    net: openvswitch: simplify the flow_hash

    Simplify the code and remove the unnecessary BUILD_BUG_ON.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

datapath: optimize flow-mask looking up

Upstream commit:
    commit 57f7d7b9164426c496300d254fd5167fbbf205ea
    Author: Tonghao Zhang <xiangxia.m.yue@gmail.com>
    Date:   Fri Nov 1 22:23:49 2019 +0800

    net: openvswitch: optimize flow-mask looking up

    The full looking up on flow table traverses all mask array.
    If mask-array is too large, the number of invalid flow-mask
    increase, performance will be drop.

    One bad case, for example: M means flow-mask is valid and NULL
    of flow-mask means deleted.

    +-------------------------------------------+
    | M | NULL | ...                  | NULL | M|
    +-------------------------------------------+

    In that case, without this patch, openvswitch will traverses all
    mask array, because there will be one flow-mask in the tail. This
    patch changes the way of flow-mask inserting and deleting, and the
    mask array will be keep as below: there is not a NULL hole. In the
    fast path, we can "break" "for" (not "continue") in flow_lookup
    when we get a NULL flow-mask.

             "break"
                v
    +-------------------------------------------+
    | M | M |  NULL |...           | NULL | NULL|
    +-------------------------------------------+

    This patch don't optimize slow or control path, still using ma->max
    to traverse. Slow path:
    * tbl_mask_array_realloc
    * ovs_flow_tbl_lookup_exact
    * flow_mask_find

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

datapath: don't unlock mutex when changing the user_features fails

Upstream commit:
    commit 4c76bf696a608ea5cc555fe97ec59a9033236604
    Author: Tonghao Zhang <xiangxia.m.yue@gmail.com>
    Date:   Fri Nov 1 22:23:53 2019 +0800

    net: openvswitch: don't unlock mutex when changing the user_features fails

    Unlocking of a not locked mutex is not allowed.
    Other kernel thread may be in critical section while
    we unlock it because of setting user_feature fail.

Fixes: 95a7233c4 ("net: openvswitch: Set OvS recirc_id from tc chain index")
Cc: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

datapath: fix GFP flags in rtnl_net_notifyid()

Upstream commit:
    commit d4e4fdf9e4a27c87edb79b1478955075be141f67
    Author: Guillaume Nault <gnault@redhat.com>
    Date:   Wed Oct 23 18:39:04 2019 +0200

    netns: fix GFP flags in rtnl_net_notifyid()

    In rtnl_net_notifyid(), we certainly can't pass a null GFP flag to
    rtnl_notify(). A GFP_KERNEL flag would be fine in most circumstances,
    but there are a few paths calling rtnl_net_notifyid() from atomic
    context or from RCU critical sections. The later also precludes the use
    of gfp_any() as it wouldn't detect the RCU case. Also, the nlmsg_new()
    call is wrong too, as it uses GFP_KERNEL unconditionally.

    Therefore, we need to pass the GFP flags as parameter and propagate it
    through function calls until the proper flags can be determined.

    In most cases, GFP_KERNEL is fine. The exceptions are:
      * openvswitch: ovs_vport_cmd_get() and ovs_vport_cmd_dump()
        indirectly call rtnl_net_notifyid() from RCU critical section,

      * rtnetlink: rtmsg_ifinfo_build_skb() already receives GFP flags as
        parameter.

    Also, in ovs_vport_cmd_build_info(), let's change the GFP flags used
    by nlmsg_new(). The function is allowed to sleep, so better make the
    flags consistent with the ones used in the following
    ovs_vport_cmd_fill_info() call.

    Found by code inspection.

Fixes: 9a9634545c70 ("netns: notify netns id events")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Backport the datapath.c portion of this fix.

Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

datapath: Set OvS recirc_id from tc chain index

Upstream commit:
    commit 95a7233c452a58a4c2310c456c73997853b2ec46
    Author: Paul Blakey <paulb@mellanox.com>
    Date:   Wed Sep 4 16:56:37 2019 +0300

    net: openvswitch: Set OvS recirc_id from tc chain index

    Offloaded OvS datapath rules are translated one to one to tc rules,
    for example the following simplified OvS rule:

    recirc_id(0),in_port(dev1),eth_type(0x0800),ct_state(-trk) actions:ct(),recirc(2)

    Will be translated to the following tc rule:

    $ tc filter add dev dev1 ingress \
                prio 1 chain 0 proto ip \
                    flower tcp ct_state -trk \
                    action ct pipe \
                    action goto chain 2

    Received packets will first travel though tc, and if they aren't stolen
    by it, like in the above rule, they will continue to OvS datapath.
    Since we already did some actions (action ct in this case) which might
    modify the packets, and updated action stats, we would like to continue
    the proccessing with the correct recirc_id in OvS (here recirc_id(2))
    where we left off.

    To support this, introduce a new skb extension for tc, which
    will be used for translating tc chain to ovs recirc_id to
    handle these miss cases. Last tc chain index will be set
    by tc goto chain action and read by OvS datapath.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Backport the local datapath changes from this patch and add compat
layer fixup for the DECLARE_STATIC_KEY_FALSE macro.

Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

datapath: Print error when ovs_execute_actions() fails

Upstream commit:
    commit aa733660dbd8d9192b8c528ae0f4b84f3fef74e4
    Author: Yifeng Sun <pkusunyifeng@gmail.com>
    Date:   Sun Aug 4 19:56:11 2019 -0700

    openvswitch: Print error when ovs_execute_actions() fails

    Currently in function ovs_dp_process_packet(), return values of
    ovs_execute_actions() are silently discarded. This patch prints out
    an debug message when error happens so as to provide helpful hints
    for debugging.
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

datapath: do not update max_headroom if new headroom is equal to old headroom

Upstream commit:
    commit 6b660c4177aaebdc73df7a3378f0e8b110aa4b51
    Author: Taehee Yoo <ap420073@gmail.com>
    Date:   Sat Jul 6 01:08:09 2019 +0900

    net: openvswitch: do not update max_headroom if new headroom is equal to old headroom

    When a vport is deleted, the maximum headroom size would be changed.
    If the vport which has the largest headroom is deleted,
    the new max_headroom would be set.
    But, if the new headroom size is equal to the old headroom size,
    updating routine is unnecessary.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

datapath: drop unneeded likely() call around IS_ERR()

Upstream commit:
    commit b90f5aa4d6268e81dd1fd51e5ef89d2892bf040d
    Author: Enrico Weigelt <info@metux.net>
    Date:   Wed Jun 5 23:06:40 2019 +0200

    net: openvswitch: drop unneeded likely() call around IS_ERR()

    IS_ERR() already calls unlikely(), so this extra likely() call
    around the !IS_ERR() is not needed.

Signed-off-by: Enrico Weigelt <info@metux.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

datapath: return an error instead of doing BUG_ON()

Upstream commit:
    commit a734d1f4c2fc962ef4daa179e216df84a8ec5f84
    Author: Eelco Chaudron <echaudro@redhat.com>
    Date:   Thu May 2 16:12:38 2019 -0400

    net: openvswitch: return an error instead of doing BUG_ON()

    For all other error cases in queue_userspace_packet() the error is
    returned, so it makes sense to do the same for these two error cases.

Reported-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

Eliminate "whitelist" and "blacklist" terms.

There is one remaining use under datapath. That change should happen
upstream in Linux first according to our usual policy.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>

Use primary/secondary, not master/slave, as names for OpenFlow roles.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>

dpctl: Fix broken flow deletion via ovs-dpctl due to missing ufid.

Current code generates UFID for flows installed by ovs-dpctl.  This
leads to inability to remove such flows by the same command.  Ex:

  ovs-dpctl add-dp test
  ovs-dpctl add-if test vport0
  ovs-dpctl add-flow test "in_port(0),eth(),eth_type(0x800),ipv4(src=100.1.0.1)" 0
  ovs-dpctl del-flow test "in_port(0),eth(),eth_type(0x800),ipv4(src=100.1.0.1)"

  dpif|WARN|system@test: failed to flow_del (No such file or directory)
      ufid:e4457189-3990-4a01-bdcf-1e5f8b208711 in_port(0),
      eth(src=00:00:00:00:00:00,dst=00:00:00:00:00:00),eth_type(0x0800),
      ipv4(src=100.1.0.1,dst=0.0.0.0,proto=0,tos=0,ttl=0,frag=no)

  ovs-dpctl: deleting flow (No such file or directory)
  Perhaps you need to specify a UFID?

During del-flow operation UFID is generated too, however resulted
value is different from one generated during add-flow.  This happens
because odp_flow_key_hash() function uses random base value for flow
hashes which is different on every invocation.  That is not an issue
while running 'ovs-appctl dpctl/{add,del}-flow' because execution
of these requests happens in context of the OVS main process, i.e.
there will be same random seed.

Commit e61984e781e6 was intended to allow offloading for flows
added by dpctl/add-flow unixctl command, so it's better to generate
UFIDs conditionally inside dpctl command handler only for appctl
invocations.  Offloading is not possible from ovs-dpctl utility anyway.

There are still couple of corner case:  It will not be possible to
remove flow by 'ovs-appctl dpctl/del-flow' without specifying UFID if
main OVS process was restarted since flow addition and it will not
be possible to remove flow by ovs-dpctl without specifying UUID if
it was added by 'ovs-appctl dpctl/add-flow'.  But these scenarios
seems minor since these commands intended for testing only.

Reported-by: Eelco Chaudron <echaudro@redhat.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2020-September/374863.html
Fixes: e61984e781e6 ("dpif-netlink: Generate ufids for installing TC flowers")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Tested-by: Eelco Chaudron <echaudro@redhat.com>

travis: Disable check for array of flexible structures in sparse.

Sparse introduced new checks for flexible arrays and there is a
false-positive in netdev-linux implementation right now that can not
be easily fixed.  Patch sent to sparse to fix it, but we need to
disable the check for now to unblock our CI.

  lib/netdev-linux.c:1238:19: error: array of flexible structures

The issue is with the following code:

  union {
      struct cmsghdr cmsg;
      char buffer[CMSG_SPACE(sizeof(struct tpacket_auxdata))];
  } cmsg_buffers[NETDEV_MAX_BURST];

'struct cmsghdr' contains a flexible array.  But this union is a way
to ensure correct alignment of 'buffer', suggested by CMSG manpage.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

datapath: Fix exposing OVS_TUNNEL_KEY_ATTR_GTPU_OPTS to kernel module.

Kernel module doesn't know about GTPU and it should return correct
out-of-range error in case this tunnel attribute passed there for
any reason. Current out-of-tree module will pass the range check
and will try to access ovs_tunnel_key_lens[] array by index
OVS_TUNNEL_KEY_ATTR_GTPU_OPTS. Even though it might not produce
issues in current code, this is not a good thing to do since
ovs_tunnel_key_lens[] array is not explicitly initialized for
OVS_TUNNEL_KEY_ATTR_GTPU_OPTS and we will likely have misleading
error about incorrect attribute length in the end.

Fixes: 3c6d05a02e0f ("userspace: Add GTP-U support.")
Acked-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

dns-resolve: Allow unbound's config file to be set through an env var.

When an unbound context is created, check whether OVS_UNBOUND_CONF has been
set. If a valid config file is supplied then use it to configure the
context. The procedure returns if the config file is invalid. If no config
file is found then the default unbound config is used.

Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ted Elhourani <ted.elhourani@nutanix.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

ofproto-dpif-upcall: Log the emergency flow flush.

When the number of flows in the datapath reaches twice the
maximum, revalidators will delete all flows as an emergency
action to recover. In that case, log a message with values
and increase a coverage counter.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

ofproto-dpif-upcall: Log the value of flow limit.

The datapath flow limit is calculated by revalidators so
log the value as well.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

ovsdb-idl.at: Queue for termination all OVSDB IDL pids.

When running OVSDB cluster tests on Windows not all the ovsdb processes
are terminated. Queue up the pids of the started processes for
termination when the test stops.

Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

smap: Add smap_get_uint() helper function.

This helper function is required by OVN.

Suggested-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

system-userspace-packet-type-aware.at: Wait for ip address updates.

ovs-router module checks for the source ip address of the interface
while adding a new route.  netdev module doesn't request ip addresses
from the system every time, but instead it caches currently assigned
ip addresses and updates the cache on netlink notifications if needed.

So, there is a slight delay between setting ip address on interface
in a system and a moment OVS updates list of ip addresses of this
interface.  If route addition happens within this time frame, it
fails with the following error:

    # ovs-appctl ovs/route/add 10.0.0.0/24 br-p1
    Error while inserting route.
    ovs-appctl: ovs-vswitchd: server returned an error

This makes system tests to fail frequently.

Let's wait until local route successfully added.  This will mean
that OVS finished processing of a netlink event and will use up to
date list of ip addresses on desired interface.

Fixes: 526cf4e1d6a8 ("tests: Added unit tests in packet-type-aware.at")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>

dpif-netdev: Fix typo in copyright header.

Reported-by: David Marchand <david.marchand@redhat.com>
Fixes: 352b6c7116cd ("dpif-lookup: add avx512 gather implementation.")
Fixes: f5ace7cd8a85 ("dpif-netdev: Move dpcls lookup structures to .h")
Cc: Harry Van Haaren <harry.van.haaren@intel.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>

docs: Add flow control on i40e issue

There is an issue with flow control configuration on i40e devices
and it has a work around. We add this to documentation as known issue
until a permanent solution is developed.

Signed-off-by: Tomasz Konieczny <tomaszx.konieczny@intel.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>

DPDK: Remove support for vhost-user zero-copy.

Support for vhost-user dequeue zero-copy was deprecated in OVS 2.14 with
the aim of removing it for OVS 2.15.

OVS only supports zero copy for vhost client mode, as such it will cease
to function due to DPDK commit [1]

Also DPDK is set to remove zero-copy functionality in DPDK 20.11 as
referenced by commit [2]

As such remove support from OVS.

[1] 715070ea10e6 ("vhost: prevent zero-copy with incompatible client mode")
[2] d21003c9dafa ("doc: announce removal of vhost zero-copy dequeue")

Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Kevin Traynor <ktraynor@redhat.com>

appveyor: Bump outdated links and add artifacts

Bump OpenSSL.

Add release and debug configuration.

Build and add Windows installer to generated artifacts.

Build and zip prebuilt version.

Co-authored-by: Yonggang Luo <luoyonggang@gmail.com>
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Co-authored-by: Jinjun Gao <jinjung@vmware.com>
Signed-off-by: Jinjun Gao <jinjung@vmware.com>
Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>

windows, installer: Bundle Windows 10 driver

This patch bundles the Windows 10 driver family in the installer and also
adds detection for the family.

Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>

windows: Update build with latest pthread project

pthreads-win32 has moved too PThreads4W.

This patch updates the build steps, CI (appveyor) and documentation.

Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>

windows, installer: Bundle latest runtime version

Until now we were bundling MSVC120 x86 runtime.

This patch changes it too the latest version and also add the 64 bit version
of it.

Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>