Xu Binbin [Mon, 13 Aug 2018 02:27:42 +0000 (10:27 +0800)]
netdev-dpdk: Support the link speed of XL710
In the scenario of XL710, the link speed which stored in the table
of Interface is not 40G. Because the implementation of query of link
speed only support to 10G, the parameter 'current' will be a random
value in the scenario of higher link speed. In this case, incorrect
link speed of XL710 nic will be stored in the database.
Signed-off-by: Xu Binbin <xu.binbin1@zte.com.cn> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Ben Pfaff [Fri, 24 Aug 2018 19:25:39 +0000 (12:25 -0700)]
ofproto-dpif-trace: Make -generate send packets to controller again.
Prior to the OVS 2.9 development cycle, any flow that sent a packet to a
controller required that the flow be slow-pathed. In some cases this led
to poor performance, so OVS 2.9 made controller actions fast-pathable. As
a side effect of the change, "ovs-appctl ofproto/trace -generate" no longer
sent packets to the controller. This usually didn't matter but it broke
the Faucet tutorial, which relied on this behavior. This commit
reintroduces the original behavior and thus should fix the tutorial.
Numan Siddique [Fri, 24 Aug 2018 19:26:52 +0000 (00:56 +0530)]
ovn: Fix the issue in IPv6 Neigh Solicitation responder for router IPs
Commit [1] added a new action 'nd_na_router' to set the router bit
in the 'flags' field of the Neighbour Adv packet for router IPs.
This action was used in the router pipeline. But the logical switch
pipeline also adds the Neighbour Adv flows for router IPs but with
'nd_na' action (which the commit [1] didn't handle).
This patch fixes this by changing the action to 'nd_na_router' for
router IPs.
Without this patch, the IPv6 functionality is broken.
[1] - "c9756229ed: ovn: Set proper Neighbour Adv flag when replying
for NS request for router IP"
Signed-off-by: Numan Siddique <nusiddiq@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Han Zhou <hzhou8@ebay.com>
Yunjian Wang [Mon, 27 Aug 2018 11:52:55 +0000 (19:52 +0800)]
dpctl: Fix memory leak in dp_exists().
Fixes: ffdcd110fa62 ("dpctl: Make opt_dpif_open() more general.") Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Greg Rose [Fri, 24 Aug 2018 20:36:29 +0000 (13:36 -0700)]
ofproto-dpif: Check for EBUSY as well
Guru reported that we can't create more than one geneve tunnel.
Sometimes a driver will return EBUSY as well as EEXIST for some
duplicate configurations. Check for EBUSY too.
Fixes: 7521e0cf9e ("ofproto-dpif: Let the dpif report when a ...")
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2018-August/047214.html Reported-by: Guru Shetty <guru@ovn.org> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
This patch enables GRE tests to run even if the buggy `ip`
is being used.
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Reviewed-by: Greg Rose <gvrose8192@gmail.com> Tested-by: Greg Rose <gvrose8192@gmail.com>
Yifeng Sun [Tue, 21 Aug 2018 14:42:08 +0000 (07:42 -0700)]
datapath: Add support for kernel 4.16.x & 4.17.x
Add support for kernel version up to 4.17.x. On Travis, build passed
for all kernel versions. And no new test fails are introduced by this
patch.
Cleaned up file datapath/linux/compat/include/net/ip6_fib.h which
has no effect to kernel module but brings complexity to porting.
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Reviewed-by: Greg Rose <gvrose8192@gmail.com> Tested-by: Greg Rose <gvrose8192@gmail.com>
William Tu [Tue, 21 Aug 2018 23:03:14 +0000 (16:03 -0700)]
erspan: add big endian bit fields.
Big-endian systems arrange bit fields in the opposite order.
The patch follows the linux kernel's approach by defining the
big and little endian bit-field of ERSPAN header using #ifdef.
Tested on zelenka.debian.org
(https://db.debian.org/machines.cgi?host=zelenka).
Tested-by: Ben Pfaff <blp@ovn.org> Reported-by: James Page <james.page@canonical.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2018-August/351382.html Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Tue, 21 Aug 2018 16:22:03 +0000 (09:22 -0700)]
tests: Fix hash function dependencies in "tunnel - ERSPAN v1/v2 metadata".
This test only worked if each OpenFlow port was assigned a particular
datapath port number: p1 to port 3, p2 to port 2, p3 and p4 to port 1.
This happened consistently on little-endian architectures because of the
use of a particular hash function, but on big-endian architectures it
failed because the hash function was different.
This commit fixes the problem by adding the non-dummy ports separately.
(Dummy ports try to take the datapath port number corresponding to their
name, when it is available.) This does result in swapping a couple of
datapaths port numbers, so that p1 has port 1, p2 has port 2, and the
erspan ports have port 3, hence the size of the patch.
Reported-by: James Page <james.page@canonical.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2018-August/351382.html Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: William Tu <u9012063@gmail.com>
Aaron Conole [Wed, 8 Aug 2018 14:36:10 +0000 (10:36 -0400)]
ovn-ctl: allow configuring user:group for daemons
Add two options, one for controlling the ovs daemon user/group, and the
other for controlling the ovn daemon user/group. This allows a fine-grained
split between OVN and OVS daemons, and keeps the syntax and user/group
separation from ovs-ctl when running ovn-ctl.
Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Martin Xu [Mon, 20 Aug 2018 21:24:05 +0000 (14:24 -0700)]
rhel: support kmod build against mulitple kernel versions, fedora
This patch ports changes from kmod rhel6 spec file to fedora spec file,
to support packaging kernel modules built against multiple versions of
kernel sources.
RHEL 7.4 introduced backward incompatible changes in the kernel. As
a result, prebuilt PRM packages against kernels newer than 693.17.1
will cannot be used on systems with older kernels, vice versa.
Intended to work only on RHEL 7.4 (kernel version 3.10.0-693.yy.zz).
This patch allows multiple kernel version numbers delimited by
whitespace to be passed as variable "kversion". The result RPM packages
the kernel module .ko files from all specified kernel versions. For
example,
make rpm-fedora-kmod \
RPMBUILD_OPT='-D "kversion 3.10.0-693.1.1.el7.x86_64 \
3.10.0-693.17.1.el7.x86_64"'
By default, make tries to build against the current running kernel.
This patch also includes a script to update the weak-update symlinks
if the system kernel version is upgraded or downgraded after
openvswitch-kmod is installed.
Signed-off-by: Martin Xu <martinxu9.ovs@gmail.com> CC: Greg Rose <gvrose8192@gmail.com> CC: Flavio Leitner <fbl@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Flavio Leitner <fbl@redhat.com>
Han Zhou [Mon, 20 Aug 2018 05:27:31 +0000 (22:27 -0700)]
ovn-northd: Support learning neighbor from ARP request.
Current LR dynamic ARP learning support only ARP responses. If a
IP-MAC binding is learned, it will not get updated even if a host
send a GARP *request* to inform the new binding. This patch supports
learning neighbor changes from ARP requests, including GARP requests.
Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Han Zhou [Mon, 20 Aug 2018 05:27:30 +0000 (22:27 -0700)]
ovn-northd: LR respond ARP from valid subnet only.
Currently ovn LR datapath responds ARP requests even if the ARP
requestor's src IP doesn't belong to the LR port's subnets. This
may generate unnecessary ARP responses and there could also be
security concerns. This patch restricts the ARP response only if
the requestor's IP matches the LR port's subnets.
Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Sat, 18 Aug 2018 17:17:37 +0000 (10:17 -0700)]
netdev-linux: Avoid division by 0 if kernel reports bad scheduler data.
If the kernel reported a value of 0 for the second value in
/proc/net/psched, it would cause a division-by-zero fault in
read_psched(). I don't know of a kernel that would actually do that, but
it's still better to be safe.
Found by clang static analyzer.
Reported-by: Bhargava Shastry <bshastry@sect.tu-berlin.de> Signed-off-by: Ben Pfaff <blp@ovn.org> Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Ben Pfaff [Mon, 6 Aug 2018 21:36:34 +0000 (14:36 -0700)]
ovsdb-client: Make "wait" command logging more sensible.
The "wait" command in ovsdb-client (which was introduced as part of the
clustering support) fairly often logs things that are normal for it but
in other circumstances might be cause for concern, for example messages
about being unable to connect to a remote. Until now, it has tried to
suppress some of those itself by raising log levels. Unfortunately, in
some cases this had the opposite effect because it overrode any settings on
the command line, such as an attempt in ovsdb-cluster.at to suppress all
logging related to the timeval module. This commit drops the special
log levels from the "wait" command and puts equivalents into the tests
themselves.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Justin Pettit <jpettit@ovn.org>
Ben Pfaff [Wed, 15 Aug 2018 21:57:13 +0000 (14:57 -0700)]
ofp-actions: Avoid assertion failure for clone(ct(...bad actions...)).
decode_NXAST_RAW_CT() temporarily pulls data off the beginning of its
ofpacts output ofpbuf and, on its error path, fails to push it back on.
At a higher layer, decode_NXAST_RAW_CLONE() asserts, via
ofpact_finish_CLONE(), that the ofpact_clone that it put is still in the
place where it put it, which causes an assertion failure.
The root cause here is the failure to re-push the clone header. One could
fix that, but it would be pretty easy for that to go wrong again on some
other obscure error path. Instead, this commit just makes the problem go
away by always saving and restoring 'ofpact->data' if a decode fails.
Reported-at: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=9862 Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Justin Pettit <jpettit@ovn.org>
Justin Pettit [Fri, 17 Aug 2018 19:48:54 +0000 (12:48 -0700)]
dpif-netlink: Prevent abort in probe_broken_meters().
Commit 92d0d515d ("dpif-netlink: Probe for broken Linux meter
implementations.") introduced a deadlock on the 'once' structure
declared in probe_broken_meters() with the following callstack:
Yi-Hung Wei [Fri, 17 Aug 2018 09:05:09 +0000 (02:05 -0700)]
dpif-netlink: Implement conntrack zone limit
This patch provides the implementation of conntrack zone limit
in dpif-netlink. It basically utilizes the netlink API to
communicate with OVS kernel module to set, delete, and get conntrack
zone limit.
Yi-Hung Wei [Fri, 17 Aug 2018 09:05:07 +0000 (02:05 -0700)]
dpif: Support conntrack zone limit.
This patch defines the dpif interface to support conntrack
per zone limit. Basically, OVS users can use this interface
to set, delete, and get the conntrack per zone limit for various
dpif interfaces. The following patch will make use of the proposed
interface to implement the feature.
Currently, nf_conntrack_max is used to limit the maximum number of
conntrack entries in the conntrack table for every network namespace.
For the VMs and containers that reside in the same namespace,
they share the same conntrack table, and the total # of conntrack entries
for all the VMs and containers are limited by nf_conntrack_max. In this
case, if one of the VM/container abuses the usage the conntrack entries,
it blocks the others from committing valid conntrack entries into the
conntrack table. Even if we can possibly put the VM in different network
namespace, the current nf_conntrack_max configuration is kind of rigid
that we cannot limit different VM/container to have different # conntrack
entries.
To address the aforementioned issue, this patch proposes to have a
fine-grained mechanism that could further limit the # of conntrack entries
per-zone. For example, we can designate different zone to different VM,
and set conntrack limit to each zone. By providing this isolation, a
mis-behaved VM only consumes the conntrack entries in its own zone, and
it will not influence other well-behaved VMs. Moreover, the users can
set various conntrack limit to different zone based on their preference.
The proposed implementation utilizes Netfilter's nf_conncount backend
to count the number of connections in a particular zone. If the number of
connection is above a configured limitation, ovs will return ENOMEM to the
userspace. If userspace does not configure the zone limit, the limit
defaults to zero that is no limitation, which is backward compatible to
the behavior without this patch.
The following high leve APIs are provided to the userspace:
- OVS_CT_LIMIT_CMD_SET:
* set default connection limit for all zones
* set the connection limit for a particular zone
- OVS_CT_LIMIT_CMD_DEL:
* remove the connection limit for a particular zone
- OVS_CT_LIMIT_CMD_GET:
* get the default connection limit for all zones
* get the connection limit for a particular zone
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: Justin Pettit <jpettit@ovn.org>
Yi-Hung Wei [Fri, 17 Aug 2018 09:05:04 +0000 (02:05 -0700)]
datapath: compat: Introduce static key support
Static keys allow the inclusion of seldom used features in
performance-sensitive fast-path kernel code, via a GCC feature and a
code patching technique. For more information:
* https://www.kernel.org/doc/Documentation/static-keys.txt
Since upstream ovs kernel module now uses some static key API that was
introduced in v4.3 kernel, we shall backport them to the compat module
for older kernel supprots.
This backport is based on upstream net-next commit 11276d5306b8
("locking/static_keys: Add a new static_key interface").
Yi-Hung Wei [Fri, 17 Aug 2018 09:05:03 +0000 (02:05 -0700)]
datapath: compat: Backports nf_conncount
This patch backports the nf_conncount backend that counts the number
of connections matching an arbitrary key. The following patch will
use the feature to support connection tracking zone limit in ovs
kernel datapath.
This backport is based on an upstream net-next upstream commits. 5c789e131cbb ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search") 34848d5c896e ("netfilter: nf_conncount: Split insert and traversal") 2ba39118c10a ("netfilter: nf_conncount: Move locking into count_tree()") 976afca1ceba ("netfilter: nf_conncount: Early exit in nf_conncount_lookup() and cleanup") cb2b36f5a97d ("netfilter: nf_conncount: Switch to plain list") 2a406e8ac7c3 ("netfilter: nf_conncount: Early exit for garbage collection") b36e4523d4d5 ("netfilter: nf_conncount: fix garbage collection confirm race") 21ba8847f857 ("netfilter: nf_conncount: Fix garbage collection with zones") 5e5cbc7b23ea ("netfilter: nf_conncount: expose connection list interface") 35d8deb80c30 ("netfilter: conncount: Support count only use case") 6aec208786c2 ("netfilter: Refactor nf_conncount") d384e65f1e75 ("netfilter: return booleans instead of integers") 625c556118f3 ("netfilter: connlimit: split xt_connlimit into front and backend")
The upstream nf_conncount has a couple of export functions while
this patch only export the ones that ovs kernel module needs.
Yi-Hung Wei [Fri, 17 Aug 2018 09:05:02 +0000 (02:05 -0700)]
compat: Backport nf_ct_netns_{get, put}()
This patch backports nf_ct_netns_get/put() in order to support a feature
in the follow up patch.
nf_ct_netns_{get,put} were first introduced in upstream net-next commit ecb2421b5ddf ("netfilter: add and use nf_ct_netns_get/put") in kernel
v4.10, and then updated in commmit 7e35ec0e8044 ("netfilter: conntrack:
move nf_ct_netns_{get,put}() to core") in kernel v4.15. We need to
invoke nf_ct_netns_get/put() when the underlying nf_conntrack_l3proto
supports net_ns_{get,put}().
Therefore, there are 3 cases that we need to consider.
1) Before nf_ct_{get,put}() is introduced.
We just mock nf_ct_nets_{get,put}() and do nothing.
2) After 1) and before v4.15
Backports based on commit 7e35ec0e8044 .
Yifeng Sun [Thu, 16 Aug 2018 16:52:55 +0000 (09:52 -0700)]
porting: Add fixes to support kernel 4.15.x
This patch enables OVS kernel module to run on kernel 4.15.x.
Two conntrack-related tests failed:
- conntrack - multiple zones, local
- conntrack - multi-stage pipeline, local
This might be due to conntrack policy changes for packets coming
from local ports on kernel 4.15. More survey will be done later.
Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Co-authored-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Tested-by: Gregory Rose <gvrose8192@gmail.com> Reviewed-by: Gregory Rose <gvrose8192@gmail.com>
Ben Pfaff [Wed, 15 Aug 2018 22:03:43 +0000 (15:03 -0700)]
ofp-ed-props: Fix hang for crafted OpenFlow encap/decap properties.
decode_ed_prop() accepted encap/decap properties with a reported length of
0, without consuming any data from the property list, which yielded an
infinite loop.
Reported-at: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=9918 Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Darrell Ball<dlu998@gmail.com>
Ben Pfaff [Tue, 14 Aug 2018 18:31:46 +0000 (11:31 -0700)]
ovsdb-idl: Adjust indexes during transactions.
When transactions modified tables with indexes, the indexes were not
properly updated to reflect the changes. For deleted rows, in particular,
this could cause use-after-free errors.
This commit fixes the problem and adds some simple test cases provided by
Han Zhou that, without the fix, cause a crash.
Reported-by: Han Zhou <zhouhan@gmail.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2018-August/047185.html Signed-off-by: Ben Pfaff <blp@ovn.org>
Justin Pettit [Thu, 9 Aug 2018 00:31:17 +0000 (17:31 -0700)]
dpif-netlink: Probe for broken Linux meter implementations.
Meter support was introduced in Linux 4.15. In some versions of Linux
4.15, 4.16, and 4.17, there was a bug that never set the id when the
meter was created, so all meters essentially had an id of zero. This
commit adds a probe to check for that condition and disable meters on
those kernels.
Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Justin Pettit [Wed, 8 Aug 2018 02:51:26 +0000 (19:51 -0700)]
dpif: Don't pass in '*meter_id' to meter_set commands.
The original intent of the API appears to be that the underlying DPIF
implementaion would choose a local meter id. However, neither of the
existing datapath meter implementations (userspace or Linux) implemented
that; they expected a valid meter id to be passed in, otherwise they
returned an error. This commit follows the existing implementations and
makes the API somewhat cleaner.
Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Yifeng Sun [Wed, 15 Aug 2018 13:24:50 +0000 (06:24 -0700)]
system-traffic: Add 5 new tunnel tests that don't need native linux modules
Introduce 5 new tests that don't require native gre or erspan tunnels but
sends simulated raw packets.
These tests are supposed to only run for kernel version from 3.10.x to 4.15.x
where compatible gre is being used by OVS kernel module.
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Tested-by: Greg Rose <gvrose8192@gmail.com> Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Yifeng Sun [Wed, 15 Aug 2018 13:24:49 +0000 (06:24 -0700)]
system-traffic: Skip 5 tunnel tests on certain kernel versions
Skip gre, erspan and ip6erspan related tests on kernel version from 3.10.x
to 4.15.x because compatible gre is used and these tests will always fail.
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Tested-by: Greg Rose <gvrose8192@gmail.com> Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Yifeng Sun [Wed, 15 Aug 2018 13:24:48 +0000 (06:24 -0700)]
tests: Add two m4 functions to skip tests for certain kernel versions
Some tests depend on native Linux gre modules to setup testing environments.
However, some kernel versions require OVS to use compatible gre modules. In
this case, these tests always fail.
This patch helps to skip a test if it fails due to this reason. The new m4
functions will be used by later patches.
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Tested-by: Greg Rose <gvrose8192@gmail.com> Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Yifeng Sun [Wed, 15 Aug 2018 13:24:47 +0000 (06:24 -0700)]
ip6_gre: Fix a bug that clears address bits
In compatible gre module, skb->cb is solely used as ovs_gso_cb.
However, IPCB(skb) also points to skb->cb. IPCB(skb)->flags overlaps
with ovs_gso_cb.tun_dst. As a result, this bug clears the 16-23 bit
in the address of ovs_gso_cb.tun_dst and causes kernel to crash.
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Tested-by: Greg Rose <gvrose8192@gmail.com> Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Darrell Ball [Fri, 10 Aug 2018 15:56:17 +0000 (08:56 -0700)]
dpctl: Make opt_dpif_open() more general.
By making opt_dpif_open() more general, it can be used effectively
by all potential callers and avoids trying to open potentially bogus
datapaths provided by the user. Also, the error handling is improved by
reducing bogus errors and having more specific real errors.
Signed-off-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Daniel Alvarez [Mon, 13 Aug 2018 12:07:45 +0000 (14:07 +0200)]
netdev: Retry getting interfaces on inconsistent dumps from kernel
This patch in glibc [0] is fixing a bug where we may be getting
inconsistent dumps from the kernel when listing interfaces due to
a race condition.
This could happen if we try to retrieve them while interfaces are
being added/removed from the system at the same time.
For systems running against old glibc versions, this patch is retrying
the operation up to 3 times and then proceeding by logging a
warning.
Note that 3 times should be enough to not delay the operation much
and since it's unlikely that we hit the race condition 3 times in
a row. Still, if this happened, this patch is not changing the
current behavior.
ofproto-dpif-upcall: Fix for flow limit issue in revalidator
When the revalidator thread takes a long time to dump data path
flows (e.g. due to busy CPU), it reduces the maximum limit for
new flows that can be added. This results in more upcalls for
packets which do not find data path flows and temporarily reduces
overall throughput. When the situation improves and the revalidator
gets enough CPU cycles, it should increase the flow limit allowing
more flows to get inserted.
Currently the flow limit does not increase if the existing number of
flows is less than 2000 and does not allow any new flows due to
incorrect condition check. This results in a permanent drop in
performance in OVS with no automatic recovery.
This patch fixes the conditional check for increasing flow limit.
Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ilya Maximets [Tue, 14 Aug 2018 07:53:16 +0000 (10:53 +0300)]
tests: Use environment variable for default timeout.
Introduce new 'OVS_CTL_TIMEOUT' environment variable
that, if set, will be used as a default timeout for
OVS control utilities. Setting it in 'atlocal.in' will
cover all the hangs inside the testsuite, even when
utils called in a subshell.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ilya Maximets [Tue, 14 Aug 2018 07:53:15 +0000 (10:53 +0300)]
utilities: Fix and unify parsing of timeout option.
Parsing of the '--timeout' option implemented differently
for every single control utility and, which is more
important, highly inaccurate. In most cases unsigned result
of 'strtoul' stored in signed variable. Parsing failures are
not tracked. 'ovs-appctl' even uses just 'atoi' without any
checking of the argument or result.
This patch unifies the parsing by using 'str_to_uint'.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Fri, 15 Jun 2018 22:11:10 +0000 (15:11 -0700)]
test-unixctl.py: Don't suppress exceptions.
A user reported a failure of test 2364 "vlog - RFC5424 facility - Python2"
with an exit code that says that the test-unixctl process died from an
uncaught exception. Unfortunately the exception didn't show up in the log.
This commit should make the exception show up (it deletes some boilerplate
we use in our Python-based daemons to make them restart themselves on
failure, which isn't needed or appropriate for a test script).
Ben Pfaff [Fri, 15 Jun 2018 22:11:09 +0000 (15:11 -0700)]
ovsdb-server: Don't log closing session at program termination.
When ovsdb-server closes a remote connection, it logs a message about it
that includes the reason. Until now this has included sessions that it
closes when it exits. That meant that, when --run was used, there was a
race between noticing that the subprocess exited and noticing that the
session that that subprocess (presumably) had open had been closed. If
it noticed the latter first, nothing was logged (because it didn't log
anything if a session was closed in the ordinary way by the client). If
it noticed the former first, it logged a message about closing the session
itself.
This is a benign race that causes no real problems--except that the tests
didn't expect to see the log message from the former case and fail with
errors like the following:
1826. ovsdb-server.at:92: testing truncating database log with bad transaction ...
./ovsdb-server.at:96: ovsdb-tool create db schema
stderr:
stdout:
./ovsdb-server.at:104: ovsdb-server --remote=punix:socket db --run="sh txnfile"
--- /dev/null 2018-04-24 08:50:58.769000000 +0000
+++ /root/openvswitch-2.9.2/rpm/rpmbuild/BUILD/openvswitch-2.9.2/tests/testsuite.dir/at-groups/1826/stderr 2018-05-29 14:29:56.529257295 +0000
@@ -0,0 +1,2 @@
+2018-05-29T14:29:56Z|00001|ovsdb_jsonrpc_server|INFO|unix#0: disconnecting (removing ordinals database due to server termination)
+2018-05-29T14:29:56Z|00002|ovsdb_jsonrpc_server|INFO|unix#0: disconnecting (removing _Server database due to server termination)
This fixes the race. This particular log message isn't too useful since
it's pretty obvious that ovsdb-server is closing those sessions, since
after all it's exiting!
Ben Pfaff [Thu, 14 Jun 2018 21:49:23 +0000 (14:49 -0700)]
configure: Enable GCC relevant new 8.x warning options.
These don't trigger any new actual warnings in my own build.
GCC 8.x adds other new warning options that are enabled by -Wall or
-Wextra. This commit doesn't explicitly enable those because OVS already
enables -Wall and -Wextra.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Aaron Conole <aconole@bytheb.org>
liucheng (J) [Tue, 14 Aug 2018 04:08:27 +0000 (04:08 +0000)]
ofproto: Fix coredump in ofproto_destroy__().
There is a coredump when I add and delete bridges. When the rcu thread call
ofproto_destroy__, the main thread may call ofproto_create. But the
ofproto_destroy__ fun doesn't have the ofproto_mutex when access the
all_ofprotos.
#0 0x00007f824aa0d197 in raise () from /usr/lib64/libc.so.6
#1 0x00007f824aa0e888 in abort () from /usr/lib64/libc.so.6
#2 0x0000000000658249 in PAT_abort ()
#3 0x000000000065538d in patchIllInsHandler ()
#4 <signal handler called>
#5 0x0000000000478a5b in hmap_remove (node=0x3320150, hmap=0x95fc40 <all_ofprotos>) at include/openvswitch/hmap.h:287
#6 ofproto_destroy__ (ofproto=0x3320150) at ofproto/ofproto.c:1642
#7 0x0000000000535e46 in ovsrcu_call_postponed () at lib/ovs_rcu.c:323
#8 0x0000000000536014 in ovsrcu_postpone_thread (arg=<optimized out>) at lib/ovs_rcu.c:338
#9 0x0000000000538488 in ovsthread_wrapper (aux_=<optimized out>) at lib/ovs_thread.c:682
#10 0x00007f824c130dc5 in start_thread () from /usr/lib64/libpthread.so.0
#11 0x00007f824aacf7bd in clone () from /usr/lib64/libc.so.6
Signed-off-by: Cheng Liu <liucheng11@huawei.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Han Zhou [Mon, 13 Aug 2018 17:48:03 +0000 (10:48 -0700)]
ovsdb-idl: Track changes for table references.
If a change of a row is tracked, make sure the rows that reference
this row are also added in tracked changes, unless change tracking
is not required for those rows.
Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Han Zhou [Mon, 13 Aug 2018 17:48:02 +0000 (10:48 -0700)]
ovsdb-idlc.in: Support more interfaces for passing pointers of individual tables.
This is a follow-up patch for commit 0eb1e37c, to add more interfaces
that supports passing around pointers of individual tables, which will
be used in incremental processing.
Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Han Zhou [Mon, 13 Aug 2018 17:48:01 +0000 (10:48 -0700)]
ovsdb-idl.c: Update conditions when handling change tracking list.
There are places with the pattern:
if (!ovs_list_is_empty(&row->track_node)) {
ovs_list_remove(&row->track_node);
}
ovs_list_push_back(&row->table->track_list,
&row->track_node);
It seems to be trying to prevent double removing the node from a list,
but actually it doesn't, and may be misleading.
The function ovs_list_is_empty() returns true only if the node has been
initialized with ovs_list_init(). If a node is deleted but not
initialized, the check will return false and the above code will continue
deleting it again. But if a node is already initialized, calling
ovs_list_remove() is a no-op. So the check is not necessary and misleading.
In fact there is already a double removal resulted by this code: the function
ovsdb_idl_db_clear() removes the node but it then calls ovsdb_idl_row_destroy()
immediately which removes the node again. It should not result in any real
issue yet since the list was not changed so the second removal just
assigns the pointers with same value.
It is in fact not necessary to remove and then add back to the list, because
the purpose of the change tracking is just to tell if a row is changed
or not. So this patch removes the "check and remove" code before adding
the node to a list, but instead, adding it to the list only if it is
not in a list. This way it ensures the node is added to a list only once
in one idl loop. (ovsdb_idl_db_track_clear() will be called in each
iteration and the check ovs_list_is_empty() will return true at the first
check in next iteration).
Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Han Zhou [Mon, 13 Aug 2018 17:48:00 +0000 (10:48 -0700)]
ovsdb-idl: Remove a misleading comment for change tracking.
The comment was added when the feature was introduced but what it
described is not what is implemented, probably because of revisions
after code reviews.
Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Numan Siddique [Tue, 7 Aug 2018 11:38:11 +0000 (17:08 +0530)]
python jsonrpc: Allow jsonrpc_session to have more than one remote.
Python IDL implementation doesn't have the support to connect to the
cluster dbs. This patch adds this support. We are still missing the
support in python idl class to connect to the cluster master. That
support will be added in an upcoming patch.
This patch is similar to the commit 8cf6bbb184 which added multiple remote
support in the C jsonrpc implementation.
Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Numan Siddique <nusiddiq@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Numan Siddique [Tue, 7 Aug 2018 11:37:58 +0000 (17:07 +0530)]
ovs python: ovs.stream.open_block() returns success even if the remote is unreachable
The python function ovs.socket_util.check_connection_completion() uses select()
(provided by python) to monitor the socket file descriptor. The select()
returns 1 when the file descriptor becomes ready. For error cases like -
111 (Connection refused) and 113 (No route to host) (POLLERR), ovs.poller._SelectSelect.poll()
expects the exceptfds list to be set by select(). But that is not the case.
As per the select() man page, writefds list will be set for POLLERR.
Please see "Correspondence between select() and poll() notifications" section of select(2)
man page.
Because of this behavior, ovs.socket_util.check_connection_completion() returns success
even if the remote is unreachable or not listening on the port.
This patch fixes this issue by using poll() to check the connection status similar to
the C implementation of check_connection_completion().
A new function 'get_system_poll() is added in ovs/poller.py which returns the
select.poll() object. If select.poll is monkey patched by eventlet/gevent, it
gets the original select.poll() and returns it.
The test cases added in this patch fails without the fix.
Suggested-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Numan Siddique <nusiddiq@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Mark Michelson <mmichels@redhat.com>
Greg Rose [Mon, 13 Aug 2018 23:00:16 +0000 (16:00 -0700)]
compat: Substitute more dependable define
The compat layer ip_tunnel_get_stats64 function was checking for the
Linux kernel version to determine if the return was void or a pointer.
This is not very reliable and caused compile warnings on SLES 12 SP3.
In acinclude.m4 create a more reliable method of determining when to
use a void return vs. a pointer return.
Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Han Zhou [Sat, 11 Aug 2018 03:29:46 +0000 (20:29 -0700)]
ovsdb-idl.c: Fix IDL index problem when rows are updated.
In current IDL index code it doesn't updated index when handling
"update2" messages, which is the default case. The consequence
is that when a row is updated, the index is not updated accordingly,
and even worse, it causes crash when calling ovsdb_idl_destroy().
It can be easily reproduced by the test cases added in this patch.
Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
rhel: Install the network scripts in a new subpackage
Starting from Fedora 29, the legacy network scripts are installed in
the "network-scripts" package and so the network scripts ("ifup-ovs",
"ifdown-ovs") should be installed only when the "network-scripts" package
is installed.
This commit introduces (on Fedora 29+) a new subpackage
(network-scripts-openvswitch). This subpackage is installed, by default, only
if the "network-scripts" package is installed too (reverse weak dependency).
Kevin Traynor [Thu, 9 Aug 2018 15:13:58 +0000 (16:13 +0100)]
vswitch.xml: Update dpdk-init documentation.
dpdk-init is now a string. Add description of 'true' and 'try'.
Fixes: 3e52fa5644cd ("dpdk: reflect status and version in the database") Cc: aconole@redhat.com Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Ben Pfaff [Tue, 7 Aug 2018 18:18:56 +0000 (11:18 -0700)]
ofproto-dpif-xlate: Improve log message.
Until now, the bridge name was at the end of the log message, after the
flow, which made it easy to miss. This commit moves it before the flow
where it is easier to spot.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Flavio Leitner <fbl@sysclose.org>
Instead, it must be configured as two different commands,
"
ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk \
options:dpdk-devargs=0000:05:00.1
ovs-vsctl set Interface dpdk0 options:rx-flow-ctrl=true
"
The DPDK ixgbe driver is now validating all the 'rte_eth_fc_conf' fields before
trying to configuring the dpdk ethdev. Hence OVS can no longer set the
'dont care' fields to just '0' as before. This commit make sure all the
'rte_eth_fc_conf' fields are populated with default values before the dev
init.
Also to avoid read error on unsupported ports, the flow control parameters
are now read only when user is trying to configure/update it.
Signed-off-by: Sugesh Chandran <sugesh.chandran@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Aaron Conole [Wed, 8 Aug 2018 00:34:52 +0000 (20:34 -0400)]
table: fix html buffer output
Prior to this commit, html output exhibits a doppler effect for
content by continually printing strings passed from
table_print_html_cell.
Fixes: cb139fa8b3a1 ("table: New function table_format() for formatting a table as a string.") Cc: Ben Pfaff <blp@ovn.org> Cc: Jakub Sitnicki <jsitnicki@gmail.com> Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Aaron Conole [Wed, 8 Aug 2018 00:34:51 +0000 (20:34 -0400)]
table: append newline when printing tables
With commit cb139fa8b3a1 ("table: New function table_format() for
formatting a table as a string.") a new mechanism for formatting
tables was introduced, and the table_print method was refactored to
use this.
During that refactor, calls to 'puts' were replaced with
'ds_put_cstr', and table print was changed to use 'fputs(...,
stdout)'. Unfortunately, fputs() does not append a newline to the
string provided, and changes the output strings of, for example,
ovsdb-client dump to print all on one line. This means
post-processing scripts that are chained after ovsdb-client would
either block indefinitely (if they don't detect EOF), or process the
entire bundle at once (rather than seeing each table on a separate
line).
Fixes: cb139fa8b3a1 ("table: New function table_format() for formatting a table as a string.") Cc: Ben Pfaff <blp@ovn.org> Cc: Jakub Sitnicki <jsitnicki@gmail.com> Reported-by: Terry Wilson <twilson@redhat.com>
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1608508 Signed-off-by: Aaron Conole <aconole@redhat.com> Suggested-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Terry Wilson <twilson@redhat.com> Tested-by: Terry Wilson <twilson@redhat.com>
In the case there was no sorting criteria the flows on Windows were being
rearranged because it was always returning zero.
Also check if there we need sorting to save a few cycles.
CC: Ben Pfaff <blp@ovn.org> Co-authored-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org>
Markos Chandras [Wed, 8 Aug 2018 14:27:25 +0000 (17:27 +0300)]
rhel: Use correct user in the logrotate configuration file
The /var/log/openvswitch directory is owned by the openvswitch user but
logrotate could be running as root or as another user. As a result of
which, rpmlint prints the following warning when building the spec file
on SUSE Linux Enterprise:
openvswitch.x86_64: W: suse-logrotate-user-writable-log-dir /var/log/openvswitch openvswitch:openvswitch 0750
The log directory is writable by unprivileged users. Please fix the
permissions so only root can write there or add the 'su' option
to your logrotate config
In order to fix that, we should run the logrotate script as the same
user which runs the various Open vSwitch daemons. If this is a new
installation, then this user is the 'openvswitch' one, but if we are
upgrading from an older release, then the user is normally 'root'.
As such, we set the initial user to 'root' and we fix this up in the
%post scriptlet.
Justin Pettit [Tue, 7 Aug 2018 23:45:26 +0000 (16:45 -0700)]
datapath: meter: Fix setting meter id for new entries
Upstream commit:
From: Justin Pettit <jpettit@ovn.org>
Date: Sat, 28 Jul 2018 15:26:01 -0700
Subject: [PATCH] openvswitch: meter: Fix setting meter id for new entries
The meter code would create an entry for each new meter. However, it
would not set the meter id in the new entry, so every meter would appear
to have a meter id of zero. This commit properly sets the meter id when
adding the entry.
Fixes: 96fbc13d7e77 ("openvswitch: Add meter infrastructure") Signed-off-by: Justin Pettit <jpettit@ovn.org> Cc: Andy Zhou <azhou@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Justin Pettit <jpettit@ovn.org> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Han Zhou [Tue, 7 Aug 2018 02:44:02 +0000 (19:44 -0700)]
ovn-trace: Fix warnings when port is found but not in current datapath.
When port group is used, ovn-trace may print warnings like this:
$ ovn-trace ls1 'inport == "lp111" && eth.src == f0:00:00:00:01:11 && eth.dst == f0:00:00:00:01:12 && ip4.src == 192.168.11.1 && ip4.dst == 192.168.11.2 && ip.ttl == 10'
2018-08-02T01:43:23Z|00001|ovntrace|WARN|lp211: not in datapath ls1
2018-08-02T01:43:23Z|00002|ovntrace|WARN|lp211: unknown logical port
2018-08-02T01:43:23Z|00003|ovntrace|WARN|lp221: not in datapath ls1
2018-08-02T01:43:23Z|00004|ovntrace|WARN|lp221: unknown logical port
2018-08-02T01:43:23Z|00005|ovntrace|WARN|lp231: not in datapath ls1
2018-08-02T01:43:23Z|00006|ovntrace|WARN|lp231: unknown logical port
There are 2 warnings:
For the first one, it might be reasonable
before port group is supported, but now since ports in a port group
can span across multiple datapaths, this situation is normal, and
warning should not be printed.
For the second one, it is misleading, and it should not be printed
in this situation even before port group is supported. It should be
printed only if the port is not found at all.
This patch fixes both.
Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Mark Michelson <mmichels@redhat.com>
Han Zhou [Tue, 7 Aug 2018 02:44:01 +0000 (19:44 -0700)]
ovn-northd: Improve efficiency of stateful checking for ACLs on port groups.
Currently in has_stateful_acl(), to check if a datapath has stateful ACLs,
it needs to iterate all port groups and check if the current datapath is
related to each port group, and then iterate the ACLs on the port group. This
is inefficient if there are a lot of port groups. A typical scenario is in
OpenStack each tenant will have a default security group which will be mapped
as a port group, and the default security group is supposed to contain ports
of the tenant only, so most likely only the logical switches belonging to the
tenant should be related to the port group, but we are checking all the port
groups belonging to all tenants for each datapath.
To improve this, a reverse direction of hmap is built from logical switch to
port group, so that the iteration is avoided. The time complexity of this
function improves from O(P * A) to O(PL * A), P = total number of port groups
in NB, PL = number of port groups related to the logical switch, A = number
of ACLs.
Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Mark Michelson <mmichels@redhat.com>
wenxu [Sat, 4 Aug 2018 08:31:36 +0000 (16:31 +0800)]
datapath: support upstream ndo_udp_tunnel_add in net_device_ops
It makes datapath can support both ndo_add_udp_tunnel_port and
ndo_add_vxlan/geneve_port. The newer kernels don't support vxlan/geneve
specific NDO's anymore
Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: Ben Pfaff <blp@ovn.org> Reviewed-by: Greg Rose <gvrose8192@gmail.com> Tested-by: Greg Rose <gvrose8192@gmail.com>
After commit ffc2b6ee4174 ("ip_gre: fix IFLA_MTU ignored on NEWLINK")
variable t_hlen is assigned values that are never read,
hence they are redundant and can be removed.
Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Xin Long [Tue, 7 Aug 2018 21:48:52 +0000 (14:48 -0700)]
ip_gre: fix IFLA_MTU ignored on NEWLINK
Upstream commit:
From: Xin Long <lucien.xin@gmail.com>
Date: Tue, 27 Feb 2018 19:19:39 +0800
Subject: [PATCH] ip_gre: fix IFLA_MTU ignored on NEWLINK
It's safe to remove the setting of dev's needed_headroom and mtu in
__gre_tunnel_init, as discussed in [1], ip_tunnel_newlink can do it
properly.
Now Eric noticed that it could cover the mtu value set in do_setlink
when creating a ip_gre dev. It makes IFLA_MTU param not take effect.
So this patch is to remove them to make IFLA_MTU work, as in other
ipv4 tunnels.
[1]: https://patchwork.ozlabs.org/patch/823504/
Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.") Reported-by: Eric Garver <e@erig.me> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Part of this commit already made it into __gre_tunnel_init but
the piece for erspan_tunnel_init did not make it in so fix that
now.
Cc: Xin Long <lucien.xin@gmail.com> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Thu, 12 Jul 2018 21:55:31 +0000 (14:55 -0700)]
tests: Ignore recirc_id in "MPLS xlate action" test.
When I run this test with DPDK enabled, it fails because it ends up using
a different recirculation ID when DPDK is not enabled. I guess that's a
little weird but the recirculation IDs are not supposed to be significant,
so this change makes the test ignore it.