git.proxmox.com Git - ovs.git/log

ovsdb-error: New function ovsdb_error_to_string_free().

This allows slight code simplifications across the tree.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Tested-by: Yifeng Sun <pkusunyifeng@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>

travis: Install libnuma dependency for DPDK.

libnuma is a default dependency for DPDK 17.11 because
CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES and CONFIG_RTE_LIBRTE_VHOST_NUMA
are enabled by default for most architectures.
libnuma-dev package installation fixes the DPDK build:

eal_memory.c:56:18: fatal error:
numa.h: No such file or directory

CC: Mark Kavanagh <mark.b.kavanagh@intel.com>
Fixes: 5e925ccc2a6f ("netdev-dpdk: DPDK v17.11 upgrade")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ian Stokes <ian.stokes@intel.com>
Tested-by: Ian Stokes <ian.stokes@intel.com>

travis: Unify DPDK build directory for stable/not stable releases.

Currently stable DPDK releases have a 'dpdk-stable-$DPDK_VER' directory
in the tarball, but non-stable releases have just 'dpdk-$DPDK_VER'.
This causes problems when moving from a stable release to a non-stable
one and vice versa. For example, the recent update to DPDK v17.11 broke
the travis build:

'dpdk-17.11.tar.gz' saved
./.travis/linux-build.sh: line 61:
cd: dpdk-stable-17.11: No such file or directory

With this change 'dpdk-$DPDK_VER' format will be used for all the
types of dpdk releases by renaming the source directory.

CC: Mark Kavanagh <mark.b.kavanagh@intel.com>
Fixes: 5e925ccc2a6f ("netdev-dpdk: DPDK v17.11 upgrade")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ian Stokes <ian.stokes@intel.com>
Tested-by: Ian Stokes <ian.stokes@intel.com>

travis: Use pip2 instead of pip for OSX build.

xcode8.3 is a new default image for OS X on Travis-CI, but
it does not have 'pip':

    pip install --user six
    ./.travis/osx-prepare.sh: line 3: pip: command not found

'pip2' or 'pip3' should be used explicitly instead:

    https://github.com/travis-ci/travis-ci/issues/8829

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>

ovsdb-idl: Tolerate initialization races for singleton tables.

By verifying that singleton tables (that is, tables that should have exactly
one row) are empty when they emit transactions that insert into them,
ovs-vsctl and similar tools tolerate initialization races, where more than one
client at a time tries to initialize a singleton table.

The upshot is that if you create a database and then run multiple ovs-vsctl
(etc.) commands against it in parallel (without first initializing it
serially), then without this patch you will sometimes get failures,
but this patch avoids them.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
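
The idea in the commit above can be illustrated with a toy in-memory model.
This is only a sketch under stated assumptions: `ToyDb`,
`commit_insert_if_empty`, and `init_singleton` are hypothetical names and
are not part of the OVSDB IDL API.

```python
class ToyDb:
    """Toy stand-in for a database holding one singleton table."""

    def __init__(self):
        self.rows = []

    def commit_insert_if_empty(self, row):
        """Atomically insert `row` only if the table is still empty.

        Returns True on success, False if the precondition failed.
        """
        if self.rows:
            return False
        self.rows.append(row)
        return True


def init_singleton(db, row):
    """Initialize the singleton table, tolerating a lost race.

    If another client inserted first, use its row instead of failing.
    """
    if not db.commit_insert_if_empty(row):
        return db.rows[0]
    return row
```

The second client to arrive sees the failed precondition and simply adopts
the existing row, so the table never ends up with two rows.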

ovsdb-idl: Remove 'uuid' member of struct ovsdb_idl.

This was used to uniquely identify the monitor, but there's no need for
that. A fixed monitor name works fine.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>

ovsdb-idl: Fix assertion failure on error path parsing server reply.

If the database server sent an error reply to a monitor_cond request, and
the error was not a JSON string, then passing the error to json_string()
caused an assertion failure.

Found by inspection.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>

ovsdb-idl: Fix indentation in a couple of places.

White space changes only.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>

ovsdb-idl: Improve comments.

This change documents the IDL state machine, adds other comments,
and fixes a spelling error in a comment.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>

conntrack: Fix icmp error address sanity check.

An address sanity check is done on icmp error packets to
check that the icmp error payload makes sense w.r.t. the
packet itself.

The sanity check was partially incorrect since it tried
to verify the source address of the error packet against the
original destination, which does not make sense since the error
can be generated by any intermediate node.

Reported-by: wangzhike <wangzhike@jd.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2017-December/341609.html
Fixes: a489b1685 ("conntrack: New userspace connection tracker.")
CC: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: wangzhike <wangzhike@jd.com>
Co-authored-by: wangzhike <wangzhike@jd.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
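
A minimal sketch of the relaxed check described above. It assumes that the
part of the check that remains compares the error packet's destination with
the embedded packet's source; the commit message does not spell this out,
and the function name is hypothetical.

```python
import ipaddress


def icmp_error_addresses_ok(outer_dst, inner_src):
    """Return True if an ICMP error plausibly relates to its embedded packet.

    The error is delivered to the sender of the original packet, so the
    outer destination should equal the embedded packet's source. The outer
    source is deliberately not checked: any intermediate node may have
    generated the error.
    """
    return ipaddress.ip_address(outer_dst) == ipaddress.ip_address(inner_src)
```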

conntrack: Disable algs by default.

Presently, alg processing is enabled by default to better exercise code.
This is similar to kernels before 4.7 as well. The recommended default
behavior in the newer kernels is to only process algs if a helper is
supplied in a conntrack rule. The behavior is changed to match the
later kernels.

A test is extended to check that the control connection is still
created in such a case.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>

conntrack: Allow specified alg port numbers.

Algs can use variable control port numbers for servers.
The main use case is a kind of feeble security measure; the
thinking by some being that it obscures the alg traffic.
It is really not very effective, but the kernel has this
capability. This patch mimics the capability.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>

conntrack: Refactor algs.

Upcoming requirements for new algs make it desirable to split out
alg helpers more cleanly.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>

dpif-netdev: Avoid "sparse" warning.

"sparse" warns when odp_port_t is used directly in an inequality
comparison. This avoids the warning.

CC: Kevin Traynor <ktraynor@redhat.com>
Fixes: a130f1a89bd8 ("dpif-netdev: Add port/queue tiebreaker to rxq_cycle_sort.")
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Ian Stokes <ian.stokes@intel.com>

ofproto: Keep inserting buckets into a group from changing group type.

The "insert buckets" and "delete buckets" operations on a group should not
change the group's type or properties, but the implementation did this by
mistake. This fixes the problem.

Reported-by: shivani dommeti <shivani.dommeti@gmail.com>
Tested-by: shivani dommeti <shivani.dommeti@gmail.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2017-December/045830.html
Signed-off-by: Ben Pfaff <blp@ovn.org>

daemon-unix: include missing help information

These options have existed for a while, but were not expressed in the
help information. Inform the user that these options exist, and give
some basic help.

Reported-by: Saravanan KR <skramaja@redhat.com>
Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Markos Chandras <mchandras@suse.de>

Merge branch 'dpdk_merge' of https://github.com/istokes/ovs into HEAD

datapath-windows: Add support for deleting conntrack entry by 5-tuple.

To delete a conntrack entry specified by 5-tuple pass an additional
conntrack 5-tuple parameter to flush-conntrack.

Signed-off-by: Anand Kumar <kumaranand@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>

doc: Correct path of kernel system tests results directory.

Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>

ofproto-dpif-xlate: Change assertion to log message.

Until now, compose_output_action__() has asserted that a packet output to
a patch port is not to be truncated. This commit changes this to an error
that will be included in trace output, for two reasons. First, this sounds
like only a minor problem to me which doesn't warrant killing the process.
Second, it will be easier to track down the actual problem (if any) if we
can get a trace instead of a segfault.

Reported-by: Kevin Lin <kevin@kelda.io>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2017-December/045832.html
Signed-off-by: Ben Pfaff <blp@ovn.org>

ofproto-dpif-xlate: Correctly decide whether truncating.

xlate_output_action() must tell some of the functions it calls whether the
packet is being truncated.  Until now, it has inferred that based on
whether its max_len argument is nonzero.

Unfortunately, max_len conflates two different purposes.  Historically it
was used only to limit the number of bytes of packets sent to an OpenFlow
controller in packet_in messages.  When packet truncation was introduced,
it was then also used to specify the truncation length.  This meant that,
for example, when xlate_output_reg_action() called into
xlate_output_action() passing along for max_len an OpenFlow controller byte
limit (which ovs-ofctl by default sets to 65535), xlate_output_action()
interpreted that as a truncation request and told the functions it called
that the packet was being truncated, which in the worst case led to
assertion failures.

This commit disentangles these two meanings of max_len, separating them into
two separate parameters, and updates the callers.

Reported-by: Kevin Lin <kevin@kelda.io>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2017-December/045841.html
Tested-by: Kevin Lin <kevin@kelda.io>
Signed-off-by: Ben Pfaff <blp@ovn.org>
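
The difference between inferring truncation and stating it explicitly can be
sketched as follows. The function and parameter names are hypothetical and
only model the decision, not the real xlate_output_action() signature.

```python
def output_old(max_len):
    """Old behavior: any nonzero max_len was read as a truncation request."""
    return {"controller_len": max_len, "truncating": max_len != 0}


def output_new(controller_len, is_truncating):
    """New behavior: the caller states explicitly whether it is truncating."""
    return {"controller_len": controller_len, "truncating": is_truncating}
```

With a controller byte limit of 65535 and no truncation requested, the old
form wrongly reports truncation while the new form does not.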

ovsdb-server: Document monitor_cond_change behavior for unmentioned tables.

It seems best to be explicit about this.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>

jsonrpc-server: Report monitor session ID properly in error message.

The error message in question is about the monitor session ID but it
actually reports the JSON-RPC request ID instead, which is surprising.

Found by inspection.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>

tests: Always ignore "Broken pipe" and "Connection reset" log messages.

Until now, the ovn-controller-vtep, ovn-nbctl, and ovn-sbctl tests have
ignored "Broken pipe" and "Connection reset" messages. The same rationale
that applies to them also applies to ovs-vsctl and other utilities. It
seems easier to just always ignore them.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>

stream-unix: Give accepted sockets distinct names for log messages.

At least on Linux, when process A connects to process B over a Unix
domain socket, unless process A bound its socket to a name before
it made the connection, process B gets an empty peer name.  Until
now, OVS has just reported the name of the connection as "unix".
This is not meaningful, of course.  I do not know of a good general
solution to this problem, but this commit attempts a step in the
right direction by at least giving each connection of this kind a
number: "unix#1", "unix#2", and so on.  That way, in log messages
one can at least see which messages are related to a particular
connection.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
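
The numbering scheme described above might look like this in outline. This
is a sketch, not the stream-unix code: `make_namer` is a hypothetical
helper, and the "unix:" prefix for named peers is an assumption.

```python
import itertools


def make_namer():
    """Return a function that names accepted Unix domain connections."""
    counter = itertools.count(1)

    def connection_name(peer_name):
        # A peer that bound its socket has a usable name; otherwise fall
        # back to a distinct sequence number so log lines can be correlated.
        if peer_name:
            return "unix:" + peer_name
        return "unix#%d" % next(counter)

    return connection_name
```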

test-ovsdb: Triggers should wake up other triggers immediately.

When a trigger executes, it can make changes to the database that fulfill
the conditions for some other trigger to execute. ovsdb-server implements
this properly, but the code in test-ovsdb for testing triggers outside
ovsdb-server did not. This fixes the problem.

Found by inspection.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
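
The behavior described above, where one trigger firing can immediately
enable another, can be sketched as a loop that re-checks all pending
triggers until none fires. The data model here (a dict plus
condition/action callables) is a hypothetical simplification.

```python
def run_triggers(db, triggers):
    """Run triggers until none can fire.

    `db` is a dict; each trigger is a (condition, action) pair of callables
    taking `db`. After any trigger fires, the remaining triggers are
    re-checked, because the action may have satisfied their conditions.
    Returns the indexes of the triggers in the order they fired.
    """
    pending = list(enumerate(triggers))
    fired = []
    progress = True
    while progress:
        progress = False
        for item in list(pending):
            index, (condition, action) = item
            if condition(db):
                action(db)
                fired.append(index)
                pending.remove(item)
                progress = True
    return fired
```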

test-ovsdb: Simplify code in do_trigger().

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>

netdev-dpdk: extend netdev_dpdk_get_status to include if_type and if_descr

This commit extends netdev_dpdk_get_status API to include additional
driver-related information: if_type and if_descr.

v2->v3: Code rebase.
v3->v4: Minor comments applied.
v5->v6: Adds DPDK port specific description in documentation.

Co-authored-by: Michal Weglicki <michalx.weglicki@intel.com>
Signed-off-by: Michal Weglicki <michalx.weglicki@intel.com>
Signed-off-by: Przemyslaw Szczerbik <przemyslawx.szczerbik@intel.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>

Revert "dpif_netdev: Refactor dp_netdev_pmd_thread structure."

This reverts commit a807c15796ddc43ba1ffb2a6b0bd2ad4e2b73941.

Padding and aligning of dp_netdev_pmd_thread structure members
is useless, broken in several ways, and only greatly degrades
maintainability and extensibility of the structure.

Issues:

    1. It's not working because all the instances of struct
       dp_netdev_pmd_thread are allocated only by the usual malloc. The
       memory is not aligned to cachelines -> the structure almost never
       starts at an aligned memory address. This means that any further
       padding and alignment inside the structure are completely
       useless. For example:

       Breakpoint 1, pmd_thread_main
       (gdb) p pmd
       $49 = (struct dp_netdev_pmd_thread *) 0x1b1af20
       (gdb) p &pmd->cacheline1
       $51 = (OVS_CACHE_LINE_MARKER *) 0x1b1af60
       (gdb) p &pmd->cacheline0
       $52 = (OVS_CACHE_LINE_MARKER *) 0x1b1af20
       (gdb) p &pmd->flow_cache
       $53 = (struct emc_cache *) 0x1b1afe0

       All of the above addresses are shifted from the cacheline start
       by 32B.

       Can we fix it properly? NO.
       OVS currently doesn't have an appropriate API to allocate aligned
       memory. The best candidate is 'xmalloc_cacheline()' but it
       clearly states that "The memory returned will not be at the
       start of a cache line, though, so don't assume such alignment".
       And also, this function will never return aligned memory on
       Windows or MacOS.

    2. CACHE_LINE_SIZE is not constant. Different architectures have
       different cache line sizes, but the code assumes that
       CACHE_LINE_SIZE is always equal to 64 bytes. All the structure
       members are grouped by 64 bytes and padded to CACHE_LINE_SIZE.
       This leads to huge holes in the structures if CACHE_LINE_SIZE
       differs from 64. This is the opposite of portability. If I want
       good performance of cmap I need to have CACHE_LINE_SIZE equal
       to the real cache line size, but I will have huge holes in the
       structures. If you take a look at struct rte_mbuf from DPDK
       you'll see that it uses 2 defines: RTE_CACHE_LINE_SIZE and
       RTE_CACHE_LINE_MIN_SIZE to avoid holes in mbuf structure.

    3. Sizes of system/libc defined types are not constant for all the
       systems. For example, sizeof(pthread_mutex_t) == 48 on my
       ARMv8 machine, but only 40 on x86. The difference could be
       much bigger on Windows or MacOS systems. But the code assumes
       that sizeof(struct ovs_mutex) is always 48 bytes. This may lead
       to broken alignment/big holes in case of padding/wrong comments
       about amount of free pad bytes.

    4. Sizes of many fields in the structure depend on defines like
       DP_N_STATS, PMD_N_CYCLES, EM_FLOW_HASH_ENTRIES and so on.
       Any change in these defines or any change in any structure
       contained by the thread requires a non-trivial refactoring
       of the whole dp_netdev_pmd_thread structure. This
       greatly reduces maintainability and complicates development of
       new features.

    5. There is no reason to align flow_cache member because it's
       too big and we usually access random entries by single thread
       only.

So, the padding/alignment only creates the appearance of a performance
optimization but does nothing useful in reality. It only complicates
maintenance and adds huge holes for non-x86 architectures and non-Linux
systems. The performance improvement stated in the original commit message
should be random and not valuable. I see no performance difference.

Most of the above issues are also true for some other padded/aligned
structures like 'struct netdev_dpdk'. They will be treated separately.

CC: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
CC: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
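
The 32-byte shift claimed for the gdb addresses quoted in issue 1 above can
be checked with a few lines of arithmetic. The sketch assumes a 64-byte
cache line, as the reverted code did; `cacheline_offset` is a hypothetical
helper.

```python
CACHE_LINE_SIZE = 64  # assumed value, matching the commit's example


def cacheline_offset(address, line_size=CACHE_LINE_SIZE):
    """Return how many bytes `address` lies past the start of its cache line."""
    return address % line_size


# Addresses copied from the gdb session quoted in the commit message.
GDB_ADDRESSES = {
    "pmd": 0x1B1AF20,
    "cacheline0": 0x1B1AF20,
    "cacheline1": 0x1B1AF60,
    "flow_cache": 0x1B1AFE0,
}
OFFSETS = {name: cacheline_offset(addr) for name, addr in GDB_ADDRESSES.items()}
```

Every entry in `OFFSETS` is 32, so none of the members that were meant to
start a cache line actually does.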

netdev-dpdk: vHost IOMMU support

DPDK v17.11 introduces support for the vHost IOMMU feature.
This is a security feature, which restricts the vhost memory
that a virtio device may access.

This feature also enables the vhost REPLY_ACK protocol, the
implementation of which is known to work in newer versions of
QEMU (i.e. v2.10.0), but is buggy in older versions (v2.7.0 -
v2.9.0, inclusive). As such, the feature is disabled by default
(and should remain so) for the aforementioned older QEMU
versions. Starting with QEMU v2.9.1, vhost-iommu-support can
safely be enabled, even without having an IOMMU device, with
no performance penalty.

This patch adds a new global config option, vhost-iommu-support,
that controls enablement of the vhost IOMMU feature:

ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true

This value defaults to false; to enable IOMMU support, this field
should be set to true when setting other global parameters on init
(such as "dpdk-socket-mem", for example). Changing the value at
runtime is not supported, and requires restarting the vswitch daemon.

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>

netdev-dpdk: DPDK v17.11 upgrade

This commit adds support for DPDK v17.11:
- minor updates to accommodate DPDK API changes
- update references to DPDK version in Documentation
- update DPDK version in travis' linux-build script
- document DPDK v17.11 virtio driver bug

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Jan Scheurich <jan.scheurich@ericsson.com>
Tested-by: Jan Scheurich <jan.scheurich@ericsson.com>
Tested-by: Guoshuai Li <ligs@dtdream.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>

dpif-netdev: Fix memory leak

Valgrind complains in test 1019 (dpctl - add-if set-if del-if):

4,850,896 (4,850,240 direct, 656 indirect) bytes in 1 blocks are
definitely lost in loss record 364 of 364
   by 0x517062: xcalloc (util.c:103)
   by 0x46CBBC: dp_netdev_set_nonpmd (dpif-netdev.c:4498)
   by 0x46CBBC: create_dp_netdev (dpif-netdev.c:1299)
   by 0x46CBBC: dpif_netdev_open (dpif-netdev.c:1337)
   by 0x472CB0: do_open (dpif.c:350)
   by 0x472E6F: dpif_create (dpif.c:404)
   by 0x472E6F: dpif_create_and_open (dpif.c:417)
   by 0x430EBC: open_dpif_backer (ofproto-dpif.c:727)
   by 0x430EBC: construct (ofproto-dpif.c:1411)
   by 0x41B714: ofproto_create (ofproto.c:539)
   by 0x40C84E: bridge_reconfigure (bridge.c:647)
   by 0x4104C5: bridge_run (bridge.c:2998)
   by 0x406FA4: main (ovs-vswitchd.c:119)

The reference count wasn't released at this earlier return.

This fix passes the test 'make check'.

Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>

dpif-netdev: Calculate rxq cycles prior to compare_rxq_cycles calls.

compare_rxq_cycles sums the latest cycles from each queue for
comparison with each other. While each comparison correctly
gets the latest cycles, the cycles could change between calls
to compare_rxq_cycles. In order to use consistent values through
each call of compare_rxq_cycles, sum the cycles before qsort is
called.

Requested-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
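
The approach described above, taking one snapshot per queue before sorting,
can be sketched like this. `read_cycles` is a hypothetical callable standing
in for reading a queue's counters, and the descending order is an
assumption.

```python
def sort_rxqs_by_cycles(rxqs, read_cycles):
    """Sort queues by total cycles, highest first, using one snapshot each.

    `read_cycles(rxq)` returns the queue's current per-interval cycle
    counts and may return different values on every call, so it is called
    exactly once per queue before the sort starts.
    """
    totals = {rxq: sum(read_cycles(rxq)) for rxq in rxqs}
    return sorted(rxqs, key=lambda rxq: totals[rxq], reverse=True)
```

Because the comparison only ever sees the precomputed totals, the values
cannot change between comparisons within a single sort.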

dpif-netdev: Rename rxq_cycle_sort to compare_rxq_cycles.

This function is used for comparison between queues
as part of the sort. It does not do the sort itself.
As such, give it a more appropriate name.

Suggested-by: Billy O'Mahony <billy.o.mahony@intel.com>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Billy O'Mahony <billy.o.mahony@intel.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>

dpif-netdev: Add port/queue tiebreaker to rxq_cycle_sort.

rxq_cycle_sort is used to compare rx queues by their measured number
of cycles. In the event that they are equal, 0 could be returned.
However, it is observed that returning 0 results in a different sort
order on Windows/Linux. This is ok in practice but it causes a unit
test failure for
"1007: PMD - pmd-cpu-mask/distribution of rx queues" when running
on different OS's.

In order to have a consistent sort result across multiple OS's,
introduce a tiebreaker of port/queue.

Fixes: 655856ef39b9 ("dpif-netdev: Change rxq_scheduling to use rxq processing cycles.")
Reported-by: Alin Gabriel Serdean <aserdean@ovn.org>
Tested-by: Alin Gabriel Serdean <aserdean@ovn.org>
Co-authored-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
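
The C standard leaves the relative order of equal elements after qsort()
unspecified, which is why results can differ between platforms. A sketch of
the tiebreaker idea follows; the dict fields and function names are
hypothetical.

```python
def sort_key(rxq):
    """Order by cycles (descending), then by port number and queue id."""
    return (-rxq["cycles"], rxq["port"], rxq["queue"])


def sort_rxqs(rxqs):
    """Return rx queues in an order that does not depend on input order."""
    return sorted(rxqs, key=sort_key)
```

Since no two distinct queues share the same (port, queue) pair, the key is
unique and the result is fully determined.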

netdev-dpdk: Remove unneeded call to rte_eth_dev_count().

The call to rte_eth_dev_count() was added as workaround
for rte_eth_dev_get_port_by_name() not handling cases
when there were no DPDK ports.

In versions of DPDK >= 17.02 rte_eth_dev_get_port_by_name()
does handle this case (DPDK commit f9ae888b1e19).
rte_eth_dev_count() is no longer needed so remove it.

Acked-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>

netdev-dpdk: Add comment about variables naming convention.

It'll be nice to document current naming convention for variables of
the following types used in netdev-dpdk:

* netdev
* netdev_dpdk
* netdev_rxq
* netdev_rxq_dpdk

to be sure that we will not return to the chaos that existed before
commit d46285a2206f ("netdev-dpdk: Consistent variable naming.").

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>

netdev-dpdk: Fix variables naming in set_admin_state function.

Function 'netdev_dpdk_set_admin_state()' was missed while fixing
variables naming according to the following convention:

    'struct netdev':'netdev'
    'struct netdev_dpdk':'dev'
    'struct netdev_rxq':'rxq'
    'struct netdev_rxq_dpdk':'rx'

Fixes: d46285a2206f ("netdev-dpdk: Consistent variable naming.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>

ovsdb-data: Add OVS_WARN_UNUSED_RESULT annotations to function definitions.

The function prototypes in ovsdb-data.h already have these, but it seems
more complete to have the annotation on the definitions too.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>

AUTHORS: Update email address for Thadeu Lima de Souza Cascardo.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@cascardo.eti.br>

datapath-windows: Correct endianness for deleting zone.

The zone Netlink attribute is supposed to be in network-byte order, but
the Windows code for deleting conntrack entries was treating it as
host-byte order.

Found by inspection.

Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
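
The effect of skipping the byte-order conversion can be shown with a 16-bit
zone value. This sketch only illustrates the arithmetic; the function names
are hypothetical and it is not the Windows datapath code.

```python
import struct


def parse_zone(attr_payload):
    """Decode a 16-bit zone id carried in network (big-endian) byte order."""
    (zone,) = struct.unpack("!H", attr_payload)
    return zone


def parse_zone_without_conversion(attr_payload):
    """Decode the same bytes as little-endian, which is what a little-endian
    host effectively does if it skips the conversion."""
    (zone,) = struct.unpack("<H", attr_payload)
    return zone
```

For zone 5, the bytes on the wire are 00 05; read without conversion on a
little-endian host they become 0x0500, which is zone 1280.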

dpctl: Support flush conntrack by conntrack 5-tuple

With this patch, "flush-conntrack" in ovs-dpctl and ovs-appctl accept
a conntrack 5-tuple to delete the conntrack entry specified by the 5-tuple.
For example, a user can use the following command to flush a conntrack entry
in zone 5.

$ ovs-dpctl flush-conntrack zone=5 \
'ct_nw_src=10.1.1.2,ct_nw_dst=10.1.1.1,ct_nw_proto=17,ct_tp_src=2,ct_tp_dst=1'

$ ovs-appctl dpctl/flush-conntrack zone=5 \
'ct_nw_src=10.1.1.2,ct_nw_dst=10.1.1.1,ct_nw_proto=17,ct_tp_src=2,ct_tp_dst=1'

VMWare-BZ: #1983178
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
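
The 5-tuple argument in the examples above is a comma-separated list of
key=value pairs. A minimal parsing sketch follows; it is a hypothetical
helper that only knows the five keys shown in the example, not the parser
used by dpctl.

```python
def parse_ct_tuple(text):
    """Split 'key=value,key=value' into a dict, rejecting unknown keys."""
    allowed = {"ct_nw_src", "ct_nw_dst", "ct_nw_proto", "ct_tp_src", "ct_tp_dst"}
    fields = {}
    for item in text.split(","):
        key, sep, value = item.partition("=")
        if not sep or key not in allowed:
            raise ValueError("bad field: %r" % item)
        fields[key] = value
    return fields
```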

ct-dpif,dpif-netlink: Support conntrack flush by ct 5-tuple

This patch adds support of flushing a conntrack entry specified by the
conntrack 5-tuple, and provides the implementation in dpif-netlink.
The implementation of dpif-netlink in the linux datapath utilizes the
NFNL_SUBSYS_CTNETLINK netlink subsystem to delete a conntrack entry in
nf_conntrack. Future patches will add support for the userspace and
Windows datapaths.

VMWare-BZ: #1983178
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>

dpctl: Fix comment describing get_one_dp().

Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>

tests: Use $(MKDIR_P) instead of mkdir -p.

It is more portable.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>

tests: Use $(MKDIR_P) to avoid races.

"test -d x || mkdir x" has a race when invoked in parallel: it is possible
for two processes to both see that 'x' does not exist and both try to
create it, and if that happens then one of them will fail. This avoids
the problem.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
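
The same check-then-create race exists in any language. As an illustration
in Python rather than make, a sketch of the race-free form is shown below;
`ensure_dir` and `demo` are hypothetical names.

```python
import os
import tempfile


def ensure_dir(path):
    """Create `path` if needed.

    Unlike "if not exists: mkdir", this succeeds even if another process
    creates the directory between the check and the creation attempt.
    """
    os.makedirs(path, exist_ok=True)


def demo():
    """Call ensure_dir twice on one path; the second call is a no-op."""
    with tempfile.TemporaryDirectory() as base:
        target = os.path.join(base, "x")
        ensure_dir(target)
        ensure_dir(target)
        return os.path.isdir(target)
```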

OVN pacemaker: Add the monitor action for Master role

The Pacemaker resource agent periodically calls the OVN OCF script's
"monitor" action to check the status. But the OVN OCF script doesn't add the
action "monitor" for the role "Master", because of which the Pacemaker
resource agent does not call the "monitor" action at all for the master.
If the OVN db servers exit for some reason, this goes completely undetected
and no standby node is promoted to master.

This patch adds the monitor action for the "Master" role. Also, the monitor
action does not check the status of ovn-northd (if manage_northd is yes).
This patch also checks the status of ovn-northd in the monitor action
for the "Master" role. If any of the ovsdb-servers or ovn-northd is not running,
the monitor action will return OCF_NOT_RUNNING and this will cause Pacemaker
to restart the OVN OCF resource.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1512568
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
CC: Russell Bryant <russell@ovn.org>
Signed-off-by: Russell Bryant <russell@ovn.org>

ovn/TODO: Remove some completed items.

Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>

datapath: Fix kernel panic for uninitialized tun_dst of ovs_gso_cb.

The tun_dst field in struct ovs_gso_cb, which comes from the Netlink layer,
isn't necessarily all-zeros. When a netdev port is deleted and a vxlan port
is added immediately afterward, they may use the same port_no. In that case
tun_dst in struct ovs_gso_cb has not been set when the skb is sent to the
vxlan port, and the panic is triggered.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000052
IP: [<ffffffffa07954f4>] rpl_vxlan_xmit+0x34/0x60 [openvswitch]
PGD 1f9f374067 PUD 1f9f375067 PMD 0
Oops: 0000 [#1] SMP
RIP: 0010:[<ffffffffa07954f4>]  [<ffffffffa07954f4>] rpl_vxlan_xmit+0x34/0x60 [openvswitch]
RSP: 0018:ffff881fff483898  EFLAGS: 00010202
RAX: 0000000000000040 RBX: ffff881ff2d59f00 RCX: ffff881f742016b0
RDX: 0000000000000001 RSI: ffff881f9f5f0000 RDI: ffff881ff2d59f00
RBP: ffff881fff483898 R08: 000000000000002e R09: 0000000000000000
R10: 0000000000000000 R11: ffff881fff483a50 R12: ffff881f74201680
R13: 000000000000ffbe R14: 0000000000000000 R15: ffff881ff2d59f00
FS:  00007f8b6f7fe700(0000) GS:ffff881fff480000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000052 CR3: 0000001f9f373000 CR4: 00000000000027e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Call Trace:
<IRQ>
[<ffffffffa0786480>] ovs_vport_send+0xa0/0x180 [openvswitch]
[<ffffffffa077414e>] do_output+0x4e/0xf0 [openvswitch]
[<ffffffffa07758ae>] do_execute_actions+0xa6e/0xa90 [openvswitch]
[<ffffffff815b654f>] ? netlink_unicast+0x16f/0x1b0
[<ffffffff815732bb>] ? skb_zerocopy+0x1fb/0x380
[<ffffffffa07847ca>] ? flow_lookup.isra.8+0x4a/0xc0 [openvswitch]
[<ffffffffa0775b2d>] ovs_execute_actions+0x4d/0x140 [openvswitch]
[<ffffffffa077c604>] ovs_dp_process_packet+0x94/0x140 [openvswitch]
[<ffffffffa07762c4>] ? ovs_ct_update_key+0xc4/0x150 [openvswitch]
[<ffffffffa078637b>] ovs_vport_receive+0x7b/0xe0 [openvswitch]
[<ffffffffa077c604>] ? ovs_dp_process_packet+0x94/0x140 [openvswitch]
[<ffffffff816062d6>] ? __fib_validate_source.isra.13+0x2b6/0x400
[<ffffffff8158da15>] ? dst_init+0xe5/0xf0
[<ffffffffa021a2af>] ? generic_packet+0x1f/0x30 [nf_conntrack]
[<ffffffffa02160d0>] ? nf_conntrack_in+0x350/0x5f0 [nf_conntrack]
[<ffffffffa0787047>] netdev_port_receive+0xa7/0x100 [openvswitch]
[<ffffffffa07870be>] netdev_frame_hook+0x1e/0x30 [openvswitch]
[<ffffffff81581a52>] __netif_receive_skb_core+0x1e2/0x800
[<ffffffff81582088>] __netif_receive_skb+0x18/0x60
[<ffffffff81582110>] netif_receive_skb_internal+0x40/0xc0
[<ffffffff81583228>] napi_gro_receive+0xd8/0x130
[<ffffffffa04ef634>] ixgbe_clean_rx_irq+0x7c4/0xa60 [ixgbe]
[<ffffffffa04f0930>] ixgbe_poll+0x2e0/0x6c0 [ixgbe]
[<ffffffff815828b0>] net_rx_action+0x170/0x380
[<ffffffff81090b0f>] __do_softirq+0xef/0x280
[<ffffffff816ac15c>] call_softirq+0x1c/0x30
[<ffffffff8102e47d>] do_softirq+0x5d/0xb0
[<ffffffff81090ebd>] irq_exit+0x12d/0x140
[<ffffffff816accf8>] do_IRQ+0x58/0xf0
[<ffffffff816a1ced>] common_interrupt+0x6d/0x6d
<EOI>

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>

OVN: Add external_ids to NAT and Logical_Router_Static_Route tables.

The external_ids column is missing from the NAT and
Logical_Router_Static_Route tables.

Signed-off-by: Lucas Alvares Gomes <lucasagomes@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Daniel Alvarez <dalvarez@redhat.com>
Acked-by: Miguel Angel Ajo <majopela@redhat.com>

sflow: Correctly document setup command.

Reported-by: Shivaram Mysore <shivaram.mysore@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>

coding-style: Explain when to break lines before or after binary operators.

The coding style has never been explicit about this. This commit adds some
explanation of why one position or the other might be favored in a given
situation.

Suggested-by: Flavio Leitner <fbl@sysclose.org>
Suggested-at: https://mail.openvswitch.org/pipermail/ovs-dev/2017-November/341091.html
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Tiago Lam <tiagolam@gmail.com>

odp-util: Fix another hang in NSH action parsing.

Found by libfuzzer.

Reported-by: Bhargava Shastry <bshastry@sec.t-labs.tu-berlin.de>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Jan Scheurich <jan.scheurich@ericsson.com>

lib, ovsdb: Adapt headers for C++ usage

This patch adds 'extern "C"' in a couple of header files so that
they can be compiled with C++ compilers.

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

fedora.rst, rhel.rst: Fix broken build.

This fixes several "ERROR: Unexpected indentation" messages from the
docs-check target.

Signed-off-by: Ben Pfaff <blp@ovn.org>

RPM: Improve doc to use builddep tool.

Instead of listing all the dependencies, use the RPM group
'Development Tools' and the builddep tool to find specific
ones.

Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>

ovn-ctl: Add new commands 'run_nb_server' and 'run_sb_server'

Presently, if the user wants to run the OVN db servers in separate
containers, the 'ovn-ctl' script is not useful because it passes the
'--detach' option when starting the ovsdb-servers. If the container command
is 'ovn-ctl start_nb_ovsdb', the container exits as soon as ovn-ctl exits.

This patch adds two new commands, 'run_nb_server' and 'run_sb_server', which
address the requirement described above.

Without these commands, the user may have to first generate the db by running
'ovsdb-tool' and then start the container with the command 'ovsdb-server
ovnnb_db.db ....' and this is very inconvenient.

This patch also updates the documentation in ovn-ctl.8.xml.

Suggested-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

odp-util: Fix parsing corner case for encap_nsh() actions.

When nothing matched, the code would loop forever.

Found with libfuzzer.
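
A minimal Python sketch of this failure mode, not the odp-util code: a
parse loop that tries several field parsers has to treat "no parser
consumed anything" as an error, otherwise it retries at the same offset
forever. The field names 'x' and 'y' and every helper name here are
hypothetical.

```python
import re


def make_parser(name):
    """Build a parser for 'name=value' followed by an optional comma."""
    pattern = re.compile(r"%s=(\w+),?" % re.escape(name))

    def parse(text, pos):
        # Return (characters consumed, parsed value); 0 means "no match".
        m = pattern.match(text, pos)
        if m is None:
            return 0, None
        return m.end() - pos, (name, m.group(1))

    return parse


def parse_fields(text, parsers):
    """Parse 'text' with 'parsers'; raise ValueError if nothing matches."""
    pos = 0
    results = []
    while pos < len(text):
        for parse in parsers:
            consumed, value = parse(text, pos)
            if consumed > 0:
                results.append(value)
                pos += consumed
                break
        else:
            # No parser made progress.  Without this check the loop would
            # spin at the same position forever.
            raise ValueError("unrecognized input at offset %d" % pos)
    return results


PARSERS = [make_parser("x"), make_parser("y")]
print(parse_fields("x=1,y=2", PARSERS))
```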

Reported-by: Bhargava Shastry <bshastry@sec.t-labs.tu-berlin.de>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Jan Scheurich <jan.scheurich@ericsson.com>

netdev: netdev_get_etheraddr is not functioning as advertised.

netdev_get_etheraddr claims to clear 'mac' on error, but it fails to do so.
When looking further into both netdev_windows_get_etheraddr() and
netdev_linux_get_etheraddr(), 'mac' is also not cleared. This will lead to
usage of uninitialised ofputil_phy_port.hw_addr.

v1 -> v2: fixed a bug in v1 found by Ben, thanks Ben.

Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

odp-execute: Add helpful comment to odp_execute_actions().

It wasn't obvious how ownership transferred to odp_execute_actions() or
to its callback.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>

odp-execute: Skip processing actions when batch is emptied

Today in OVS, when errors are encountered during the execution
of an action, the entire batch of packets may be deleted (e.g.
in processing push_tnl_action, if the port is not found in the
port_cache of the PMD). The remaining actions continue to be executed
even though there are no packets to be processed.

It is assumed that the code dealing with each action checks that
the batch is not empty before executing. Crashes may occur if the
assumption is not met.

The patch makes OVS skip processing of further actions from the
action-set once a batch is emptied. Doing so centralizes the check
in one place and avoids the possibility of crashes.

This change DOES NOT fix any existing bug in the code; it is only a
precautionary measure to avoid crashes if new actions do not
take care of empty batches.
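
The centralized check can be pictured with a short Python sketch. This
is an illustration under assumed names, not the odp_execute_actions()
code: 'drop_all' and 'output' are hypothetical stand-ins for real
actions, and a plain list stands in for the packet batch.

```python
def execute_actions(batch, actions):
    """Run 'actions' on 'batch' in order, stopping once it is empty.

    Returns the number of actions that actually ran."""
    executed = 0
    for action in actions:
        if not batch:
            # Single central check: no action ever sees an empty batch.
            break
        action(batch)
        executed += 1
    return executed


def drop_all(batch):
    """Stand-in for an action that hits an error and frees every packet."""
    batch.clear()


def output(batch):
    """Stand-in for an action that assumes at least one packet."""
    return batch[0]


# The second 'output' is skipped because 'drop_all' emptied the batch.
print(execute_actions(["pkt1", "pkt2"], [output, drop_all, output]))
```

Without the check in the loop, the last 'output' call would index an
empty list, which is the kind of crash the per-action checks were
supposed to prevent.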

Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

types: Avoid compound literals as initializers.

Older GCC can't cope.

Reported-by: Guoshuai Li <ligs@dtdream.com>
Reported-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Reported-by: Terry Wilson <twilson@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

cmap: Use PADDED_MEMBERS macro for cmap_bucket padding.

The current manual padding inside struct cmap_bucket doesn't work for
some cache line sizes. For example, if CACHE_LINE_SIZE equals 128, the
compiler adds an additional 8 bytes: 4 bytes between 'hashes' and
'nodes' and 4 bytes after the manual 'pad'. This triggers the build
time assertion, because sizeof(struct cmap_bucket) == 136.

Fix that by using the PADDED_MEMBERS macro, which handles all the
unexpected compiler padding.
This is safe because we still have the build time assert for the
structure size. Another possible solution is to pack the structure, but
the padding macro looks better and matches the other code.
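
The 136-byte figure can be reproduced with a little arithmetic. The
Python sketch below assumes 64-bit pointers and a bucket laid out as a
4-byte counter, K 4-byte hashes, K pointers and then the manual pad,
with K = (CACHE_LINE_SIZE - 4) / (4 + pointer size). Those layout
details are assumptions made for illustration; only 'hashes', 'nodes',
'pad' and the resulting size come from the description above.

```python
def align_up(offset, alignment):
    """Round 'offset' up to the next multiple of 'alignment'."""
    return (offset + alignment - 1) // alignment * alignment


def bucket_size(cache_line_size, ptr_size=8):
    """Size of the assumed bucket layout with manual padding only."""
    entry_size = 4 + ptr_size
    k = (cache_line_size - 4) // entry_size
    manual_pad = (cache_line_size - 4) - k * entry_size

    offset = 4                            # counter
    offset += 4 * k                       # hashes
    offset = align_up(offset, ptr_size)   # compiler padding before nodes
    offset += ptr_size * k                # nodes
    offset += manual_pad                  # manual pad bytes
    return align_up(offset, ptr_size)     # trailing compiler padding


for line in (64, 128):
    print(line, bucket_size(line))
```

With a 64-byte cache line the members happen to pack exactly; with 128
bytes the two alignment gaps push the total past the cache line.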

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

ofproto-dpif-xlate: Fix bug that may leak ofproto_flow_mod

When ofm is not referenced by xc_entry, we should release its
resources by calling ofproto_flow_mod_uninit because no one is
going to use it in this function.

Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

bfd: Fix memory leak

Valgrind complains in test 2359 ():

864 (576 direct, 288 indirect) bytes in 18 blocks are definitely
lost in loss record 96 of 101
   by 0x4A6D64: xmalloc (util.c:120)
   by 0x40BC04: gateway_chassis_get_ordered (gchassis.c:73)
   by 0x408CF0: bfd_calculate_chassis (bfd.c:219)
   by 0x408CF0: bfd_run (bfd.c:257)
   by 0x407F72: main (ovn-controller.c:718)

gateway_chassis wasn't released before the 'continue' line.

Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>

dpif: Fix memory leak

Valgrind complains in test 2322 (ovn -- 3 HVs, 3 LS, 3 lports/LS, 1 LR):

31,584 (26,496 direct, 5,088 indirect) bytes in 48 blocks are definitely
lost in loss record 422 of 427
   by 0x5165F4: xmalloc (util.c:120)
   by 0x466194: dp_packet_new (dp-packet.c:138)
   by 0x466194: dp_packet_new_with_headroom (dp-packet.c:148)
   by 0x46621B: dp_packet_clone_data_with_headroom (dp-packet.c:210)
   by 0x46621B: dp_packet_clone_with_headroom (dp-packet.c:170)
   by 0x49DD46: dp_packet_batch_clone (dp-packet.h:789)
   by 0x49DD46: odp_execute_clone (odp-execute.c:616)
   by 0x49DD46: odp_execute_actions (odp-execute.c:795)
   by 0x471663: dpif_execute_with_help (dpif.c:1296)
   by 0x473795: dpif_operate (dpif.c:1411)
   by 0x473E20: dpif_execute.part.21 (dpif.c:1320)
   by 0x428D38: packet_execute (ofproto-dpif.c:4682)
   by 0x41EB51: ofproto_packet_out_finish (ofproto.c:3540)
   by 0x41EB51: handle_packet_out (ofproto.c:3581)
   by 0x4233DA: handle_openflow__ (ofproto.c:8044)
   by 0x4233DA: handle_openflow (ofproto.c:8219)
   by 0x4514AA: ofconn_run (connmgr.c:1437)
   by 0x4514AA: connmgr_run (connmgr.c:363)
   by 0x41C8B5: ofproto_run (ofproto.c:1813)
   by 0x40B103: bridge_run__ (bridge.c:2919)
   by 0x4103B3: bridge_run (bridge.c:2977)
   by 0x406F14: main (ovs-vswitchd.c:119)

The parameter dp_packet_batch is leaked when 'may_steal' is true.

When dpif_execute_helper_cb is passed with a true 'may_steal', it
is supposed to take the ownership of dp_packet_batch and release
it when done.

Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

execution: Fix bug that leaks ovsdb_row

If there is an error after ovsdb_rbac_insert, 'row' is leaked.
So move the existing ovsdb_row_destroy to the function end.

Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

flow: Avoid buffer overread in parse_nsh() for malformed packet.

Found by libfuzzer.

CC: Jan Scheurich <jan.scheurich@ericsson.com>
Fixes: 7edef47b4896 ("NSH: Minor bugfixes")
Reported-by: Bhargava Shastry <bshastry@sec.t-labs.tu-berlin.de>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Jan Scheurich <jan.scheurich@ericsson.com>

types: New macros ETH_ADDR_C and ETH_ADDR64_C.

These macros expand to constants of type struct eth_addr and struct
eth_addr64, respectively, and make it more convenient to initialize or
assign to an Ethernet address object.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mark Michelson <mmichels@redhat.com>

util: Make xmalloc_cacheline() allocate full cachelines.

Until now, xmalloc_cacheline() has provided its caller memory that does not
share a cache line, but when posix_memalign() is not available it did not
provide a full cache line; instead, it returned memory that was offset 8
bytes into a cache line. This makes it hard for clients to design
structures to be cache line-aligned. This commit changes
xmalloc_cacheline() to always return a full cache line instead of memory
offset into one.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Bhanuprakash Bodireddy <Bhanuprakash.bodireddy@intel.com>
Tested-by: Bhanuprakash Bodireddy <Bhanuprakash.bodireddy@intel.com>
Tested-at: https://mail.openvswitch.org/pipermail/ovs-dev/2017-November/341362.html

redhat: Create /etc/openvswitch/* with openvswitch as user/group

Without this commit it is not possible to upgrade an openvswitch release
that includes the commit ac416a3ab2d2 (for example 2.8.0) to another release
that includes the commit ac416a3ab2d2 (for example master or 2.8.1), because
rpm changes the user/group of /etc/openvswitch to root/root, but ovsdb-server
starts as the user openvswitch and so it doesn't have permission to write to
/etc/openvswitch/conf.db.

This patch tells rpm to use the openvswitch user and group for
/etc/openvswitch and /etc/openvswitch/default.conf.

Reported-by: Mark Michelson <mmichels@redhat.com>
CC: aaron conole <aconole@redhat.com>
Fixes: ac416a3ab2d2 ("redhat: dynamically allocate and reference ovs user")
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Tested-by: Mark Michelson <mmichels@redhat.com>

smap: Return default on failure in smap_get_int/ullong.

Currently smap_get_int/ullong don't check for conversion errors.
Most implementations of atoi/strtoull return 0 in case of failure.

This leads to returning zero for wrongly set database values.
For example, commands

ovs-vsctl set interface iface options:key=\"\"
ovs-vsctl set interface iface options:key=qwe123
ovs-vsctl set interface iface options:key=abc

will have exactly the same effect as

ovs-vsctl set interface iface options:key=0

in case where 'key' is an integer option of the iface.
Can be checked with 'other_config:emc-insert-inv-prob' or other
integer 'options' and 'other_config's.

0 may be neither the default nor a safe value for many options, so
it is better to return the default value, if any, instead.

Conversion functions from the 'util' library are used to provide proper
error handling.
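
The difference in behavior can be sketched in Python. This is an
illustration, not the smap code: 'atoi_like' approximates what C atoi()
typically does, 'get_int' is a hypothetical strict lookup, and the
default of 100 is arbitrary.

```python
import re

_LEADING_INT = re.compile(r"\s*([+-]?[0-9]+)")
_WHOLE_INT = re.compile(r"[+-]?[0-9]+")


def atoi_like(text):
    """Approximate C atoi(): leading integer if present, otherwise 0."""
    m = _LEADING_INT.match(text)
    return int(m.group(1)) if m else 0


def get_int(options, key, default):
    """Return options[key] as an int, or 'default' if missing or malformed."""
    value = options.get(key)
    if value is None or not _WHOLE_INT.fullmatch(value):
        return default
    return int(value)


for raw in ("", "qwe123", "abc", "0", "42"):
    # Old behavior (first number) versus default-on-failure (second).
    print(repr(raw), atoi_like(raw), get_int({"key": raw}, "key", 100))
```

For the three malformed values the first column of results is 0 and the
second is the default, while well-formed values are unaffected.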

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Tested-by: Jan Scheurich <jan.scheurich@ericsson.com>
Acked-by: Jan Scheurich <jan.scheurich@ericsson.com>

util: Introduce str_to_ullong() helper function.

Will be used to convert strings to unsigned long long.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Jan Scheurich <jan.scheurich@ericsson.com>

util: Check ranges on string to int/long conversion.

It's required to check ranges to avoid integer overflow because the
underlying strtoll() checks only for LLONG_MIN/MAX.
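
A small Python sketch of why the extra check matters, assuming a 32-bit
int. 'wrap_to_int32' models the usual result of narrowing without a
check and 'str_to_int32' is a hypothetical range-checked conversion;
neither is the util.c code.

```python
INT_MIN = -(1 << 31)
INT_MAX = (1 << 31) - 1


def wrap_to_int32(value):
    """What an unchecked narrowing to a 32-bit int typically produces."""
    return (value - INT_MIN) % (1 << 32) + INT_MIN


def str_to_int32(text, base=10):
    """Return the parsed value, or None if malformed or out of range."""
    try:
        value = int(text, base)   # wide parse, like strtoll()
    except ValueError:
        return None
    if not INT_MIN <= value <= INT_MAX:
        return None               # the range check this commit describes
    return value


big = "4294967297"                # 2**32 + 1, fits in long long
print(wrap_to_int32(int(big)), str_to_int32(big))
```

The wide parse accepts the value, so without the range check it would be
silently narrowed to a small, plausible-looking number.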

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Jan Scheurich <jan.scheurich@ericsson.com>

ovs-ofctl: Fix bad free in colors_parse_from_env().

The OVS_COLORS string color_str is duplicated with xstrdup and then parsed
with strsep, which advances the pointer. We should free the original address
of the string, not the pointer left over after strsep.

Signed-off-by: Lili Huang <huanglili.huang@huawei.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mark Michelson <mmichels@redhat.com>

datapath-windows: Fix possible NULL dereference in IpFragment

If we can't allocate the NBL just go to the cleanup sequence.

Found using WDK 10 static code analysis.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Shashank Ram <shashank08@gmail.com>

datapath-windows: Fix static analysis warnings around ovsInstanceListLock

Check for return value when trying to initialize ovsInstanceListLock.

Also return the status back to caller of `OvsInitIpHelper`.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Shashank Ram <shashank08@gmail.com>

datapath-windows: Fix static analysis warnings in OvsGetTcpPayloadLength

This fixes the static code analysis warnings for the function
'OvsGetTcpPayloadLength'.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Shashank Ram <shashank08@gmail.com>

datapath-windows: Add assert to ethHdr in OvsActionMplsPush

`ethHdr` cannot be NULL because we did a partial copy before it.

Add an assert to keep the static analysis happy.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Shashank Ram <shashank08@gmail.com>

datapath-windows: Vport check RtlStringCbLengthW return value

The result of `RtlStringCbLengthW` is not currently checked and triggers
a warning using the WDK 8.1 static analysis.

This patch checks the result of `RtlStringCbLengthW`.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Shashank Ram <shashank08@gmail.com>

datapath-windows: prettify logging in iphelper

Found by inspection.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Shashank Ram <shashank08@gmail.com>

datapath-windows: Use only non executable memory

Use only non-executable memory when using MmGetSystemAddressForMdlSafe.

Introduce a new function called OvsGetMdlWithLowPriority for readability.

Found using WDK 10 static code analysis.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Shashank Ram <shashank08@gmail.com>

tunnel: Fix deletion of datapath tunnel ports in case of reconfiguration

There is an issue in OVS with tunnel deletion during the
reconfiguration of OF tunnels. If the dst_port value is changed, the
old tunnel map entry will not be deleted, because the tp_port
argument of tnl_port_map_delete() has the new dst_port setting, hence
the tunnel cannot be found in the list of tnl_port structures.

The patch corrects this mechanism by adding a new argument,
'old_odp_port' to tnl_port_reconfigure(). This value is used to
identify the datapath tunnel port which is being reconfigured. In
connection with this fix, to unify the tunnel port map handling,
odp_port value is used to search the proper port to insert and delete
tunnel map entries as well. This variable can be used instead of
tp_port, as it is unique for all datapath tunnel ports, and there is
no need to reach dst_port from netdev_tunnel_config structure.
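
The lookup problem can be illustrated with a toy map in Python. This is
a sketch under assumed names, not the tnl-ports code: the class, the
port numbers 4789/4790 and the odp_port value 3 are all hypothetical.

```python
class TunnelPortMap:
    """Toy map of tunnel entries, keyed by whatever the caller chooses."""

    def __init__(self):
        self.entries = {}

    def insert(self, key, name):
        self.entries[key] = name

    def delete(self, key):
        """Remove the entry for 'key'; return True if it existed."""
        return self.entries.pop(key, None) is not None


def reconfigure(port_map, old_key, new_key, name):
    """Replace the entry under 'old_key' with one under 'new_key'."""
    removed = port_map.delete(old_key)
    port_map.insert(new_key, name)
    return removed


# Keyed by tp_port: after dst_port changes from 4789 to 4790 the caller
# only knows the new value, so the old entry is never found.
by_tp_port = TunnelPortMap()
by_tp_port.insert(4789, "vxlan0")
print(reconfigure(by_tp_port, 4790, 4790, "vxlan0"), by_tp_port.entries)

# Keyed by odp_port, with the old value passed in explicitly.
by_odp_port = TunnelPortMap()
by_odp_port.insert(3, "vxlan0")
print(reconfigure(by_odp_port, 3, 3, "vxlan0"), by_odp_port.entries)
```

In the first case the stale 4789 entry survives the reconfiguration; in
the second the old entry is removed before the new one is inserted.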

This patch also adds a printout of the reference counter of a tnl_port
structure in tnl-port.c. It extends the OVS unit test cases to include
ref_cnt values in the expected dump, and adds new test cases that check
that packet reception still works after OF tunnel port deletion and that
check the reference counter after OF tunnel deletion or
reconfiguration.

Signed-off-by: Balazs Nemeth <balazs.nemeth@ericsson.com>
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Co-authored-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

system-stats: Include core number in the process stats.

When dumping process statistics, include the core number on which the
process was last scheduled. With 'other_config:enable-statistics=true':

Before:
  {cpu="28", file_systems="/,8474624,7826220 /workspace,223835956,199394160",
  load_average="1.29,1.76,1.33", memory="65861460,27457540,3813488,1999868,0",
  process_ovs-vswitchd="4685896,17452,362920,0,383967,383967",
  process_ovsdb-server="48088,5172,60,0,384057,384057"}

After:
  {cpu="28", file_systems="/,8474624,7826308 /workspace,223835956,199394172",
  load_average="1.30,1.04,1.13", memory="65861460,27469176,3815252,1999868,0",
  process_ovs-vswitchd="4686020,17360,127380,0,148406,148406,3",
  process_ovsdb-server="48096,5212,30,0,148496,148496,4"}

eg:
      process    vsz   , rss , cputime, crashes, booted, uptime, core_id
  ovs-vswitchd="4686020,17360, 127380,      0  , 148406, 148406,  3"
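
A consumer of these statistics might split the value as in the Python
sketch below. The field names are taken from the legend above; the
function name is hypothetical, and accepting the older six-field form
is a choice made for this example.

```python
FIELDS = ("vsz", "rss", "cputime", "crashes", "booted", "uptime", "core_id")


def parse_process_stats(value):
    """Split a process_* statistics string into named integer fields.

    Accepts both the old six-field form and the new form that ends with
    the core number; 'core_id' is None when it is absent."""
    numbers = [int(field) for field in value.split(",")]
    if len(numbers) not in (len(FIELDS) - 1, len(FIELDS)):
        raise ValueError("unexpected number of fields: %d" % len(numbers))
    stats = dict(zip(FIELDS, numbers))
    stats.setdefault("core_id", None)
    return stats


print(parse_process_stats("4686020,17360,127380,0,148406,148406,3"))
```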

Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

process: Extend get_process_info() for additional fields.

This commit enables the fields for the process name and the core
number on which the process was last scheduled. The fields will be used by
the keepalive monitoring framework in future commits.

This commit also fixes the following "sparse" warning:

lib/process.c:439:16: error: use of assignment suppression and length
modifier together in gnu_scanf format [-Werror=format=].

Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

util: Add high resolution sleep support.

This commit introduces xnanosleep() for the threads needing high
resolution sleep timeouts.

usleep(), which provides microsecond granularity, is deprecated; threads
that need sub-second (ms, us, ns) granularity can use this implementation.

Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>

ovn-northd: Treat logical ports of router type as always being up

Employ the simplest possible approach to determine the state of logical
ports that connect to logical routers by hardcoding it to always up.
This is intended to be less surprising than the current approach where
router ports appear as being down (with the exception of ones linking to
gateway routers, which are bound).

Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2017-August/045202.html
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mark Michelson <mmichels@redhat.com>
Acked-by: Miguel Angel Ajo <majopela@redhat.com>

ovn-northd: Refactor logic for logical port 'up' state update

No functional change. Make it obvious that we determine the logical
port 'up' state by checking for bound chassis, and update the NB DB only
when state has not been set yet or current state is different.

Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mark Michelson <mmichels@redhat.com>
Acked-by: Miguel Angel Ajo <majopela@redhat.com>

datapath-windows: Account for VLAN tag in tunnel Decap

Decap functions for tunneling protocols do not compute
the packet header offsets correctly when there is a VLAN
tag in the L2 header. This results in incorrect checksum
computation causing the packet to be dropped.

This patch adds support to account for the VLAN tag in the
packet if it is present, and makes use of the OvsExtractLayers()
function to correctly compute the header offsets for different
layers.

Testing done:
- Tested Geneve, STT, Vxlan and Gre and verified that there
  are no regressions.
- Verified that packets with VLAN tags are correctly handled
  in the decap code of all tunneling protocols. Previously,
  this would result in packet drops due to invalid checksums
  being computed.
- Verified that non-VLAN tagged packets are handled correctly.

Signed-off-by: Shashank Ram <rams@vmware.com>
Acked-by: Anand Kumar <kumaranand@vmware.com>
Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>

Update mailing list archive pointers to the current server.

Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>

odp-util: Fix buffer overread in parsing string form of ODP flows.

scan_u128() should return 0 on an error but it actually returned an errno
value in some cases, so a command like this:
ovs-appctl dpctl/add-flow 'ct_label(1/55555555555555555555555555)' ''
could cause a buffer overread.

This bug is not as severe as it may sound because the string form of ODP
flows is not used over OpenFlow or OVSDB, only through the appctl interface
that is normally used just by local system administrators and not exposed
over a network.

Reported-by: Bhargava Shastry <bshastry@sec.t-labs.tu-berlin.de>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>

tc: Fix build breakage on GCC 7 by annotating fall-through.

Open vSwitch enables the GCC 7+ option that warns about fall-through
switch statements. This commit fixes newly introduced warnings.

Fixes: d6118e628988 ("netdev-tc-offloads: Verify csum flags on dump from tc")
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Paul Blakey <paulb@mellanox.com>

OpenvSwitch logrotate: Use ctl file path as target in ovs-appctl to reset logs

Presently, the logrotate script searches for the pid files in
/var/log/openvswitch and passes the pid file name (without .pid) as the
target to ovs-appctl. This approach doesn't work for the OVN DB servers
since their ctl files are generated as "ovnnb_db.ctl" and "ovnsb_db.ctl".
So search for the .ctl files instead and use them as the target to
ovs-appctl.
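
The approach can be sketched in Python (the logrotate script itself is
shell). The function name and the run directory argument are
assumptions; 'ovs-appctl -t <target> vlog/reopen' is the existing
command for reopening log files.

```python
import glob
import os


def reopen_commands(rundir):
    """Build one 'vlog/reopen' command per control socket in 'rundir'.

    Every *.ctl file is used directly as the ovs-appctl target, so the
    daemon name never has to be derived from a pid file name."""
    ctl_files = sorted(glob.glob(os.path.join(rundir, "*.ctl")))
    return [["ovs-appctl", "-t", path, "vlog/reopen"] for path in ctl_files]


# Hypothetical run directory; prints nothing if it holds no .ctl files.
for command in reopen_commands("/var/run/openvswitch"):
    print(" ".join(command))
```

Each returned list could be handed to subprocess.call(); pid files in
the same directory are ignored.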

Suggested-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mark Michelson <mmichels@redhat.com>

ovn-ctl: Add -vfile:info option to OVN_NB/SB_LOG options

In the RHEL environment, when OVN db servers are started using ovn-ctl,
log files are empty. Adding the "-vfile:info" option to ovsdb-server
resolves this issue. Running 'ovs-appctl -t .. vlog/reopen' results in the
logs appearing in the log files. This issue is seen with 2.7.2.

The "-vfile:info" option is passed to ovn-northd and ovn-controller when
they start. There is no harm in adding this to the OVN db servers.

Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

Remove Perl dependency.

Nothing in the OVS tree uses Perl any longer, so remove the dependency.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>

tests: Convert miscellaneous test code from Perl to Python.

Perl is unfashionable and Python is more widely available and understood,
so this commit converts one of the OVS uses of Perl into Python.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>

tests: Convert dot2pic build tool from Perl to Python.

Perl is unfashionable and Python is more widely available and understood,
so this commit converts one of the OVS uses of Perl into Python.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>

tests: Convert sodepends build tool from Perl to Python.

Perl is unfashionable and Python is more widely available and understood,
so this commit converts one of the OVS uses of Perl into Python.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>

tests: Convert soexpand build tool from Perl to Python.

Perl is unfashionable and Python is more widely available and understood,
so this commit converts one of the OVS uses of Perl into Python.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>

tests: Convert dpdkstrip utility from Perl to Python.

Perl is unfashionable and Python is more widely available and understood,
so this commit converts one of the OVS uses of Perl into Python.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>