Róbert Mulik [Mon, 23 Apr 2018 11:42:41 +0000 (11:42 +0000)]
Configurable Link State Change (LSC) detection mode
It is possible to set LSC detection mode to polling or interrupt mode
for DPDK interfaces. The default is polling mode. To set interrupt mode,
option dpdk-lsc-interrupt has to be set to true.
For detailed description and usage see the dpdk install documentation.
Signed-off-by: Robert Mulik <robert.mulik@ericsson.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Jan Scheurich [Thu, 19 Apr 2018 17:40:46 +0000 (19:40 +0200)]
dpif-netdev: Detection and logging of suspicious PMD iterations
This patch enhances dpif-netdev-perf to detect iterations with
suspicious statistics according to the following criteria:
- iteration lasts longer than US_THR microseconds (default 250).
This can be used to capture events where a PMD is blocked or
interrupted for such a period of time that there is a risk for
dropped packets on any of its Rx queues.
- max vhost qlen exceeds a threshold Q_THR (default 128). This can
be used to infer virtio queue overruns and dropped packets inside
a VM, which are not visible in OVS otherwise.
Such suspicious iterations can be logged together with their iteration
statistics to be able to correlate them to packet drop or other events
outside OVS.
A new command is introduced to enable/disable logging at run-time and
to adjust the above thresholds for suspicious iterations:
ovs-appctl dpif-netdev/pmd-perf-log-set on | off
[-b before] [-a after] [-e|-ne] [-us usec] [-q qlen]
Turn logging on or off at run-time (on|off).
-b before: The number of iterations before the suspicious iteration to
be logged (default 5).
-a after: The number of iterations after the suspicious iteration to
be logged (default 5).
-e: Extend logging interval if another suspicious iteration is
detected before logging occurs.
-ne: Do not extend logging interval (default).
-q qlen: Suspicious vhost queue fill level threshold. Increase this
to 512 if the Qemu supports 1024 virtio queue length.
(default 128).
-us usec: change the duration threshold for a suspicious iteration
(default 250 us).
Note: Logging of suspicious iterations itself consumes a considerable
amount of processing cycles of a PMD which may be visible in the iteration
history. In the worst case this can lead OVS to detect another
suspicious iteration caused by logging.
If more than 100 iterations around a suspicious iteration have been
logged once, OVS falls back to the safe default values (-b 5/-a 5/-ne)
so that logging itself does not trigger continuous further logging.
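The two criteria above can be sketched roughly as follows. This is a minimal illustration, not the dpif-netdev-perf code: the names `IterationStats`, `suspicious_reasons` and `log_window` are hypothetical, the strict "greater than" comparisons follow the wording "lasts longer than" and "exceeds", and the inclusive logging window is an assumption.

```python
from dataclasses import dataclass

# Defaults quoted in the commit message. All names in this sketch are
# illustrative and are not identifiers from dpif-netdev-perf.
US_THR_DEFAULT = 250  # iteration duration threshold, microseconds
Q_THR_DEFAULT = 128   # max vhost queue fill level threshold


@dataclass
class IterationStats:
    duration_us: float
    max_vhost_qlen: int


def suspicious_reasons(stats, us_thr=US_THR_DEFAULT, q_thr=Q_THR_DEFAULT):
    """Return the criteria (possibly none) that flag an iteration."""
    reasons = []
    if stats.duration_us > us_thr:
        reasons.append("duration")
    if stats.max_vhost_qlen > q_thr:
        reasons.append("qlen")
    return reasons


def log_window(index, before=5, after=5):
    """Iteration indexes to log around a suspicious one (assumed inclusive)."""
    return range(max(0, index - before), index + after + 1)
```

With the defaults, a suspicious iteration at index 20 would yield the window 15..25, which corresponds to "-b 5 -a 5".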
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com> Acked-by: Billy O'Mahony <billy.o.mahony@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Jan Scheurich [Thu, 19 Apr 2018 17:40:45 +0000 (19:40 +0200)]
dpif-netdev: Detailed performance stats for PMDs
This patch instruments the dpif-netdev datapath to record detailed
statistics of what is happening in every iteration of a PMD thread.
The collection of detailed statistics can be controlled by a new
Open_vSwitch configuration parameter "other_config:pmd-perf-metrics".
By default it is disabled. The run-time overhead, when enabled, is
in the order of 1%.
The covered metrics per iteration are:
- cycles
- packets
- (rx) batches
- packets/batch
- max. vhostuser qlen
- upcalls
- cycles spent in upcalls
This raw recorded data is used threefold:
1. In histograms for each of the following metrics:
- cycles/iteration (log.)
- packets/iteration (log.)
- cycles/packet
- packets/batch
- max. vhostuser qlen (log.)
- upcalls
- cycles/upcall (log)
The histogram bins are divided linearly or logarithmically.
2. A cyclic history of the above statistics for 999 iterations
3. A cyclic history of the cumulative/average values per millisecond
wall clock for the last 1000 milliseconds:
- number of iterations
- avg. cycles/iteration
- packets (Kpps)
- avg. packets/batch
- avg. max vhost qlen
- upcalls
- avg. cycles/upcall
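As a rough illustration of the data structures described above, the following sketch shows linear and logarithmic bin edges, a histogram with an overflow bin, and a fixed-length cyclic history. It is not the OVS implementation; every name here is hypothetical, and only the history length of 999 iterations is taken from the commit message.

```python
import bisect
from collections import deque

HISTORY_LEN = 999  # iterations kept in the cyclic history, per the text above


def make_linear_bins(lo, hi, n):
    """Return n upper bin edges spaced evenly from lo to hi."""
    step = (hi - lo) / (n - 1)
    return [lo + step * i for i in range(n)]


def make_log_bins(lo, hi, n):
    """Return n upper bin edges spaced geometrically from lo to hi."""
    ratio = (hi / lo) ** (1.0 / (n - 1))
    return [lo * ratio ** i for i in range(n)]


class Histogram:
    def __init__(self, edges):
        self.edges = edges
        # One extra slot collects values above the last edge.
        self.counts = [0] * (len(edges) + 1)

    def add(self, value):
        # Index of the first edge that is >= value.
        self.counts[bisect.bisect_left(self.edges, value)] += 1


def new_history():
    """A cyclic buffer: appending beyond HISTORY_LEN drops the oldest entry."""
    return deque(maxlen=HISTORY_LEN)
```

Logarithmic edges keep the resolution fine for small values while still covering rare large outliers, which is presumably why the long-tailed metrics are marked "(log.)" in the list.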
The gathered performance metrics can be printed at any time with the
new CLI command
-nh: Suppress the histograms
-it iter_len: Display the last iter_len iteration stats
-ms ms_len: Display the last ms_len millisecond stats
-pmd core: Display only the specified PMD
The performance statistics are reset with the existing
dpif-netdev/pmd-stats-clear command.
The output always contains the following global PMD statistics,
similar to the pmd-stats-show command:
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com> Acked-by: Billy O'Mahony <billy.o.mahony@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Jan Scheurich [Thu, 19 Apr 2018 17:40:44 +0000 (19:40 +0200)]
netdev: Add optional qfill output parameter to rxq_recv()
If the caller provides a non-NULL qfill pointer and the netdev
implementation supports reading the rx queue fill level, the rxq_recv()
function returns the remaining number of packets in the rx queue after
reception of the packet burst to the caller. If the implementation does
not support this, it returns -ENOTSUP instead. Reading the remaining queue
fill level should not substantially slow down the recv() operation.
A first implementation is provided for ethernet and vhostuser DPDK ports
in netdev-dpdk.c.
This output parameter will be used in the upcoming commit for PMD
performance metrics to supervise the rx queue fill level for DPDK
vhostuser ports.
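The contract of the optional qfill output can be modelled as below. This is only a Python analogy of the C interface: the class `RxQueue`, the `want_qfill` flag standing in for a non-NULL pointer, and the burst size of 32 are assumptions made for this sketch.

```python
import errno
from collections import deque

BURST_SIZE = 32  # illustrative burst size, an assumption for this sketch


class RxQueue:
    def __init__(self, packets, supports_qfill):
        self._queue = deque(packets)
        self._supports_qfill = supports_qfill

    def recv(self, want_qfill=False):
        """Return (batch, qfill); qfill is None when the caller did not ask."""
        count = min(BURST_SIZE, len(self._queue))
        batch = [self._queue.popleft() for _ in range(count)]
        if not want_qfill:
            return batch, None
        if not self._supports_qfill:
            # Mirrors the "-ENOTSUP instead" behaviour described above.
            return batch, -errno.ENOTSUP
        # Packets still waiting in the queue after this burst.
        return batch, len(self._queue)
```

A caller that does not care about the fill level simply omits the request and pays no extra cost.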
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com> Acked-by: Billy O'Mahony <billy.o.mahony@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Pablo Cascón [Fri, 27 Apr 2018 16:40:49 +0000 (17:40 +0100)]
netdev-dpdk: don't enable scatter for jumbo RX support for nfp
Currently, RX of jumbo packets fails for NICs not supporting scatter.
Scatter is not strictly needed for jumbo RX support. This change fixes
the issue by not enabling scatter for the PMD/NIC (nfp) known not to
need it to support jumbo RX.
Note: this change is temporary and is not needed for later OVS/DPDK releases.
Reported-by: Louis Peens <louis.peens@netronome.com> Signed-off-by: Pablo Cascón <pablo.cascon@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Kevin Traynor [Wed, 25 Apr 2018 11:20:53 +0000 (12:20 +0100)]
faq: Document DPDK version maintenance.
The faq already shows the DPDK versions that were
used with each OVS version. Give information about
DPDK stable and LTS releases, so the user can understand
if those versions are maintained.
Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Han Zhou [Thu, 10 May 2018 06:32:04 +0000 (23:32 -0700)]
ovn-nbctl: Support ACL commands on port groups.
Add support for using ovn-nbctl to add/delete/list ACLs on port
groups.
A new option --type is also supported for these commands to
explicitly specify, when needed, whether the operation is on a
port-group or a logical switch. E.g.
When an OpenFlow Bundle Add message is received, a bundle entry is
created and the OpenFlow message embedded in the bundle add message is
processed. If any error is encountered while processing the embedded
message, the bundle entry is freed. The bundle entry free function
assumes that the entry has been populated with a properly formatted
OpenFlow message and performs some message specific cleanup actions.
This assumption does not hold true in the error case and OVS crashes
when performing the cleanup.
The fix is, in case of errors, to simply free the bundle entry without
attempting to perform any embedded message cleanup.
Signed-off-by: Anju Thomas <anju.thomas@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Timothy Redaelli [Thu, 10 May 2018 15:21:41 +0000 (17:21 +0200)]
rhel: openvswitch-fedora.spec.in: Specify PYTHON and PYTHON3
Currently python2 and python3 binaries are searched by following the
PATHs, but, on Fedora, the python2 package does not provide /bin/python2
and so if the PATH contains /bin before /usr/bin (for example by using
the ansible poc) then the resulting RPM file will require /bin/python2
instead of /usr/bin/python2 and this breaks some tools (for example
createrepo).
This patch specifies the full path of the python2 interpreter and,
if python3-openvswitch package is built, the full path of python3
interpreter.
Avoid crash in OvS while transmitting fragmented packets over tunnel.
Currently, when fragmented packets are to be transmitted into a tunnel,
base_flow->nw_frag, which was initially non-zero at reception, is not
reset to zero when the base_flow and flow are rewritten
as part of the emulated tnl_push action in the ofproto-dpif-xlate
module.
Because of this, when fragmented packets are transmitted out of the tunnel,
we hit a crash caused by the following assert.
With the following change, propagate_tunnel_data_to_flow__
is modified to reset *nw_frag* to zero. Also, given that we currently don't
fragment tunnelled packets, we should reset *nw_frag* to zero in
propagate_tunnel_data_to_flow__.
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
From: Rohith Basavaraja <rohith.basavaraja@ericsson.com> CC: Jan Scheurich <jan.scheurich@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Lorenzo Bianconi [Thu, 26 Apr 2018 14:35:46 +0000 (16:35 +0200)]
ovn-nbctl: Show gw chassis in decreasing prio order.
Report gateway chassis in decreasing priority order when running the
ovn-nbctl show sub-command. Add the get_ordered_gw_chassis_prio_list
routine to sort gw chassis according to the configured priority.
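The ordering itself amounts to a descending sort on the priority field. A minimal sketch, assuming each gateway chassis is represented as a dict with "name" and "priority" keys (a representation chosen for this example, not the ovn-nbctl data model):

```python
def ordered_gw_chassis(gateway_chassis):
    """Return gateway chassis sorted by decreasing priority.

    Python's sort is stable, so entries with equal priority keep
    their original relative order.
    """
    return sorted(gateway_chassis, key=lambda gc: gc["priority"], reverse=True)
```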
Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Some commands are not shown in code blocks in the Advanced Features
tutorial; they are shown as variable width text because of a missing ":"
to designate them as code blocks.
Signed-off-by: Axel Tripier <axel@tripier.fr> Signed-off-by: Ben Pfaff <blp@ovn.org>
The OVN load balance tests are failing in both kernel and userspace DP.
The problem is due to bad parsing of the load balance keys because of
using the wrong default port mode in the call to inet_parse_active().
With this fix, the tests are now passing again.
system-ovn
100: ovn -- load-balancing ok
101: ovn -- load-balancing - same subnet. ok
102: ovn -- load balancing in gateway router ok
103: ovn -- multiple gateway routers, load-balancing ok
104: ovn -- load balancing in router with gateway router port ok
Signed-off-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Mark Michelson <mmichels@redhat.com>
The function inet_parse_active() is an external API and used
as one stop shopping for parsing ip address and L4 port
combinations from many other modules. Hence, the function
header is extended to describe the special cases that it
handles.
Signed-off-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Markos Chandras [Fri, 4 May 2018 14:35:12 +0000 (15:35 +0100)]
rhel: openvswitch-fedora.spec.in: Drop explicit usermod/groupadd deps
These dependencies have been moved from the %post to the %pre scriptlet
in f624bf23b62a ("rhel: user/group openvswitch does not exist") and are
already provided by the shadow-utils package so we can simply drop
them.
Cc: Alan Pevec <alan.pevec@redhat.com> Cc: Aaron Conole <aconole@redhat.com> Signed-off-by: Markos Chandras <mchandras@suse.de> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Aaron Conole <aconole@redhat.com>
Han Zhou [Tue, 8 May 2018 22:11:52 +0000 (15:11 -0700)]
ovn-controller: Make the local-only flow generic.
The flow that handles MLF_LOCAL_ONLY flag is now added for each
multicast group, but in fact it can be more generic and only one
is needed rather than per mc group.
Suggested-by: Ben Pfaff <blp@ovn.org>
Suggested-at: https://mail.openvswitch.org/pipermail/ovs-dev/2018-May/346719.html Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Wed, 9 May 2018 18:18:15 +0000 (11:18 -0700)]
tests: Avoid OVN environment variables leaking into tests.
If $OVN_NB_DB or $OVN_SB_DB happened to be set in the environment in which
"make check" was invoked, then their values would leak into the tests'
environment and interfere with the tests. This commit avoids that problem.
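The commit message does not say how the leak is avoided. One plausible approach, shown here purely as an assumption, is to drop the two variables from the environment before the tests run. The helper name `scrubbed_env` is hypothetical.

```python
LEAKY_VARS = ("OVN_NB_DB", "OVN_SB_DB")  # the variables named in the text above


def scrubbed_env(env, leaky=LEAKY_VARS):
    """Return a copy of env without the variables that would leak into tests."""
    return {key: value for key, value in env.items() if key not in leaky}
```

Such a copy could be passed as the `env` argument of `subprocess.run` so that the caller's own environment is left untouched.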
OVS has a number of environment variables too, such as OVS_RUNDIR, but
the tests already set those to custom values.
Reported-by: Han Zhou <zhouhan@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Han Zhou <hzhou8@ebay.com>
Han Zhou [Sun, 22 Apr 2018 16:52:34 +0000 (09:52 -0700)]
ovn: support applying ACLs to port groups
Although port group can be used in match conditions of ACLs, it is
still inconvenient for clients to figure out the lswitches that each
ACL should be applied to.
This patch supports applying ACLs to port groups directly instead of
applying to each related lswitch individually. It provides convenience
for clients such as k8s and OpenStack Neutron.
Requested-at: https://mail.openvswitch.org/pipermail/ovs-dev/2018-March/344856.html Requested-by: Guru Shetty <guru@ovn.org> Requested-by: Daniel Alvarez Sanchez <dalvarez@redhat.com> Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Greg Rose [Fri, 20 Apr 2018 18:13:07 +0000 (11:13 -0700)]
compat: Fix upstream 4.4.119 kernel
The Linux 4.4.119 kernel (and perhaps others) from kernel.org
backports some dst_cache code that breaks the openvswitch kernel
due to a duplicated name "dst_cache_destroy". For most cases the
"USE_UPSTREAM_TUNNEL" covers this but in this case the dst_cache
feature needs to be separated out.
Add the necessary compatibility detection layer in acinclude.m4 and
then fix up the source files so that our own compatibility code is
excluded if the built-in kernel includes dst_cache support.
Signed-off-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org>
Han Zhou [Mon, 30 Apr 2018 17:04:54 +0000 (10:04 -0700)]
ovn.at: fix occasional dns lookup test failure
The option "--wait=hv" was missed when adding back the DNS options
for ls1-lp1 using ovn-nbctl, so the test case failed occasionally.
This commit fixes that.
Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Han Zhou [Fri, 4 May 2018 02:39:33 +0000 (19:39 -0700)]
ovn.at: fix timing in test case /32 router IP address
After mac binding is populated in SB, before sending a packet, we
should ensure HVs processed this SB change. This patch ensures it
by: ovn-nbctl --wait=hv sync.
Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Han Zhou [Tue, 8 May 2018 16:42:07 +0000 (09:42 -0700)]
ovn.at: fix occasional failure - options:requested-chassis for logical port
There are two problems in this test case that lead to occasional
failures.
Firstly, hv1_uuid or hv2_uuid could be empty because they
may be assigned before SB is updated, so adding ovn-sbctl wait-until
fixes it.
Secondly, when port-binding chassis is queried from SB,
it could be in the transient state - updated by the releasing chassis
to empty, but not yet updated by the requested chassis to the new
uuid. Although it is retried by OVS_WAIT_UNTIL, an empty string leads
to syntax error of the "test ... = ..." command and errors out immediately
in that case. So adding the prefix "x" to both sides of the test command
fixes it.
Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Han Zhou [Tue, 8 May 2018 17:29:40 +0000 (10:29 -0700)]
ovn-controller.at: fix occasional ovn-bridge-mappings test failure
This patch fixes the timing issue in the test case so that, when external-ids
is updated in Open_vSwitch OVSDB, some time is given for SB OVSDB to get
updated by ovn-controller.
Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Han Zhou [Tue, 8 May 2018 01:09:57 +0000 (18:09 -0700)]
ovn: Fix occasional failure in gratuitous ARP for NAT rules test.
In this test case it didn't wait for all HVs to catch up, which
leads to occasional failures due to timing. This fix updates
the --wait=sb to --wait=hv, which fixes the problem.
Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
ovs-vsctl: Fix segfault when attempting to del-port from parent bridge.
The error message in the check for improper bridge param is de-referencing
parent from the wrong bridge. Also, the message itself had the parent and
child bridges reversed, so that got a small tweak as well.
Also, add a regression test.
Signed-off-by: Flavio Fernandes <flavio@flaviof.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
ovn-ctl: Handle whitespaces when using eval for start_ovsdb:
eval doesn't understand white space for local var which was introduced in commit 79c7961b8b3c4b7ea0251dea2ffacfa84c84fecb for starting clustered ovn dbs.
As ovn-ctl uses sh instead of bash, parsing local var with white space will fail.
gives error: /usr/share/openvswitch/scripts/ovn-ctl: 1: local: -vfile:info: bad variable name
As a result ovsdb fails to even initialize and start. Hence, we need to separate the local keyword for all
variables used with eval to make it work with both dash and bash.
Signed-off-by: aginwala <aginwala@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Yi-Hung Wei [Thu, 3 May 2018 16:49:50 +0000 (09:49 -0700)]
ofproto-dpif-xlate: Fix segmentation fault caused by tun_table
Currently, the revalidator thread may hit segmentation fault when geneve
TLV map is updated. It is because we may store the old TLV map (struct
tun_table) in the frozen state for recirculation, and we may access the
already freed old tun_table in xlate_actions().
This patch updates the logic of getting tun_table so that we will use
the latest tun_table instead of the frozen one.
VMWare-BZ: #2106987 Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Alan Pevec [Thu, 19 Apr 2018 15:27:09 +0000 (11:27 -0400)]
rhel: user/group openvswitch does not exist
Default ownership[1] for config files is failing on an empty system:
Running scriptlet: openvswitch-2.9.0-3.fc28.x86_64
warning: user openvswitch does not exist - using root
warning: group openvswitch does not exist - using root
...
Required user/group need to be created in %pre as documented in
Fedora guideline[2]
Jianbo Liu [Tue, 1 May 2018 12:36:06 +0000 (12:36 +0000)]
odp-util: Remove unnecessary TOS ECN bits rewrite for tunnels
For tunnels, TOS ECN bits are never wildcarded because they
are always inherited. OVS will create a rewrite action if we add a rule
to modify other IP headers. But it also adds an extra ECN rewrite for
the action because of this ECN un-wildcarding.
This causes no visible error because the ECN bits to be changed are the
same in this case. But as the rule can't be offloaded to hardware, the
unnecessary ECN rewrite should be removed.
Signed-off-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
Ben Pfaff [Thu, 26 Apr 2018 16:39:30 +0000 (09:39 -0700)]
ovsdb.7: Clarify description of OVSDB.
A reader reported that "network database system" made it sound like OVSDB
was specialized for databases about networks. It's not, it's just
accessible over the network.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Justin Pettit <jpettit@ovn.org>
In case where "use_names" is set (e.g. in an interactive session) to show
the port and table names when ovs-ofctl is run with snoop command,
ovs-ofctl would get stuck in an endless loop inside "table_iterator_next"
function's while loop checking for "while (ti->send_xid != recv_xid)".
This would happen because the "vconn" to "<bridge>.snoop" socket would
not respond to TABLE_FEATURES_REQUEST sent by ovs-ofctl.
This commit disables showing port or table names in the snoop command.
Signed-off-by: Ashish Varma <ashishvarma.ovs@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Jianbo Liu [Wed, 25 Apr 2018 08:09:08 +0000 (08:09 +0000)]
lib/tc: Remove unnecessary icmp recalculation
ICMP checksum is calculated from ICMP headers and data, so hardware doesn't
need to calculate it again because we only rewrite IP headers.
Signed-off-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
Greg Rose [Tue, 17 Apr 2018 19:34:08 +0000 (12:34 -0700)]
datapath: Prevent panic
On RHEL 7.x kernels we observe a panic induced by a paging error
when the timer kicks off a job that subsequently accesses memory
that belonged to the openvswitch kernel module but was since
unloaded - thus the paging error.
The panic can be induced on any RHEL 7.x kernel with the following test:
while `true`
do
make check-kmod TESTSUITEFLAGS="-k \!gre"
done
On the systems I've been testing on it generally takes anywhere from a
minute to 15 minutes or so to repro but never longer than that. Similar
results have been seen by other testers.
This patch does not fix the underlying bug, which does need to be
investigated and fixed, but it does prevent it from occurring. We
would like to prevent customer systems from panicking while we do
further investigation to find the root cause.
Marcin Rybka [Fri, 20 Apr 2018 13:46:27 +0000 (14:46 +0100)]
tests: Add system-dpdk-testsuite
New OVS-DPDK testsuite, which can be launched via `make check-dpdk`,
tests OVS using a DPDK datapath. The testsuite already contains
initial tests:
1. EAL init
2. Add standard DPDK PHY port
3. Add vhost-user-client port
Signed-off-by: Marcin Rybka <marcinx.rybka@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Stephen Finucane [Fri, 20 Apr 2018 09:24:32 +0000 (10:24 +0100)]
docs: Clarify changes in Rx queue allocation
Two mistakes here:
- Automatic assignment of Rx queues to PMD threads has always existed -
it was simply switched from round-robin allocation to
utilization-based allocation
- The above, along with the 'pmd-rxq-rebalance' command, was added in
OVS 2.9.0 - not OVS 2.8.0 - while the 'pmd-rxq-show' command was added
in OVS 2.6.0 and modified in OVS 2.9.0
Correct both of these and modify the NEWS entry for this to clarify
things a little (it took a bit of git spelunking and bothering people on
IRC to figure out).
Signed-off-by: Stephen Finucane <stephen@that.guru> Cc: Kevin Traynor <ktraynor@redhat.com> Cc: Ian Stokes <ian.stokes@intel.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Stephen Finucane [Thu, 19 Apr 2018 12:57:24 +0000 (13:57 +0100)]
doc: Add "vdev" topic document
These are separate things from physical, ring and vhost-user interfaces
and deserve their own documents. A couple of small typos are fixed along
the way.
Signed-off-by: Stephen Finucane <stephen@that.guru> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Stephen Finucane [Thu, 19 Apr 2018 12:57:23 +0000 (13:57 +0100)]
doc: Move additional sections to "physical ports" doc
The "hotplugging", "flow control", and "Rx checksum offload" sections
only apply to 'dpdk' ports and are too detailed to include in a
high-level howto. Move them, reworking some aspects of this in the
process.
Signed-off-by: Stephen Finucane <stephen@that.guru> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Stephen Finucane [Thu, 19 Apr 2018 12:57:22 +0000 (13:57 +0100)]
doc: Add "PMD" topic document
This continues the breakup of the huge DPDK "howto" into smaller
components. There are a couple of related changes included, such as
using "Rx queue" instead of "rxq" and noting how Tx queues cannot be
configured.
Signed-off-by: Stephen Finucane <stephen@that.guru> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Stephen Finucane [Thu, 19 Apr 2018 12:57:21 +0000 (13:57 +0100)]
doc: Add an overview of the 'dpdk' port
These ports are used to allow ingress/egress from the host and are
therefore _reasonably_ important. However, there is no clear overview of
what these ports actually are or why things are done the way they are.
Start closing this gap by providing a standalone example of using these
ports along with a little more detailed overview of the binding process.
There is additional cleanup to be done for the DPDK howto, but that will
be done separately.
We enable the TODO directive so we can actually start calling out some
TODOs.
Signed-off-by: Stephen Finucane <stephen@that.guru> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Ian Stokes [Wed, 18 Apr 2018 12:30:42 +0000 (13:30 +0100)]
docs: Fix urls in index.rst.
This patch prepends 'www' to openvswitch urls in index.rst. Without this
make check-docs fails when verifying url liveness. Also remove url
referencing ovsdb-server(5) as these are no longer accessible.
Cc: Stephen Finucane <stephen@that.guru> Fixes: 4f6ec357c ("doc: Populate 'ref' section") Signed-off-by: Ian Stokes <ian.stokes@intel.com> Acked-by: Stephen Finucane <stephen@that.guru>
Ian Stokes [Wed, 18 Apr 2018 10:17:12 +0000 (11:17 +0100)]
docs: Fix sphinx urls.
Update dead url links for sphinx documentation to avoid
make check-docs failing.
Cc: Stephen Finucane <stephen@that.guru> Fixes: 26ea2d409 ("docs: Add writing guide") Fixes: 73c76b447 ("doc: Add info on building documentation") Signed-off-by: Ian Stokes <ian.stokes@intel.com> Acked-by: Stephen Finucane <stephen@that.guru>
Ian Stokes [Wed, 18 Apr 2018 09:54:09 +0000 (10:54 +0100)]
docs: Fix sflow documentation url and markup.
The url link for the blog in the sflow documentation causes make
check-docs to fail with a broken link warning. Fix this by correcting
the url address. Also use correct markup for note regarding the
configuration of sflow.
CC: Stephen Finucane <stephen@that.guru> Fixes: 198c5d3d0 ("doc: Add sFlow cookbook from website") Signed-off-by: Ian Stokes <ian.stokes@intel.com> Acked-by: Stephen Finucane <stephen@that.guru>
Kevin Traynor [Fri, 13 Apr 2018 17:40:13 +0000 (18:40 +0100)]
netdev-dpdk: Free mempool only when no in-use mbufs.
DPDK mempools are freed when they are no longer needed.
This can happen when a port is removed or a port's mtu
is reconfigured so that a new mempool is used.
It is possible that an attempt is made to return an mbuf
to a freed mempool from NIC Tx queues, and this can lead
to a segfault.
In order to prevent this, only free mempools when they
are not needed and have no in-use mbufs. As this might
not be possible immediately, create a free list of
mempools and sweep it anytime a port tries to get a
mempool.
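The deferred-free scheme described above can be sketched as follows. This is a simplified model rather than the netdev-dpdk code: `Mempool`, `MempoolManager` and the `in_use` counter are hypothetical stand-ins for the DPDK mempool and its in-use mbuf count.

```python
class Mempool:
    def __init__(self, name):
        self.name = name
        self.in_use = 0      # mbufs currently held elsewhere, e.g. NIC Tx queues
        self.freed = False


class MempoolManager:
    def __init__(self):
        self.free_list = []  # mempools no longer needed but not yet freed

    def release(self, mempool):
        """Mark a mempool as no longer needed; the actual free is deferred."""
        self.free_list.append(mempool)

    def sweep(self):
        """Free every listed mempool that has no in-use mbufs left."""
        still_busy = []
        for mempool in self.free_list:
            if mempool.in_use == 0:
                mempool.freed = True
            else:
                still_busy.append(mempool)
        self.free_list = still_busy

    def get(self, name):
        """Sweep the free list first, then hand out a new mempool."""
        self.sweep()
        return Mempool(name)
```

Sweeping on every `get` means no extra timer or thread is needed; a busy mempool simply stays on the list until a later sweep finds it idle.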
When explaining how to add vhost-user ports to a guest, using
libvirt, the following piece of configuration is used:
<disk type='dir' device='disk'>
<driver name='qemu' type='fat'/>
<source dir='/usr/src/dpdk-stable-17.11.1'/>
<target dev='vdb' bus='virtio'/>
<readonly/>
</disk>
This is used to facilitate sharing of a DPDK directory between the host
and the guest. However, for this to work selinux also needs to be
configured (or disabled). Furthermore, if one is using Ubuntu, libvirtd
would need to be added to complain only in AppArmor. Instead, in [1] it
is advised to use wget to get the DPDK sources over the internet, which
avoids this differentiation. Thus, we drop this piece of configuration
here as well and keep the example configuration as simple as possible.
This has been verified on both a Fedora 27 image and a Ubuntu 16.04 LTS
image.
When explaining how to add vhost-user ports to a guest, using
libvirt, point to the qemu-system-x86_64 binary by default, instead of
using qemu-kvm. The latter has been made obsolete and dropped from a
number of distributions (although it is still available on Fedora).
This has been verified on both a Fedora 27 image and a Ubuntu 16.04 LTS
image.
ofproto-dpif-upcall: Only call ovsrcu_postpone() on active actions
Currently, ovsrcu_postpone() is called even with a NULL argument,
i.e. when there is no data to be freed. This is causing additional
overhead because work is scheduled for the urcu thread. This change
avoids adding the postpone callback if no work needs to be done.
This especially helps for the OVS-DPDK case where the PMD threads
might no longer have to do a write() due to the latch_set(), and thus
saving a syscall.
Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
xlate: Move tnl_neigh_snoop() to terminate_native_tunnel()
Currently OVS snoops any ARP or ND packets in any bridge and populates
the tunnel neighbor cache with the retrieved data. For instance, when
an ARP reply originated by a tenant is received in an overlay bridge, the
ARP packet is snooped and tunnel neighbor cache is filled with tenant
address information. This is at best useless as tunnel endpoints can only
reside on an underlay bridge.
The real problem starts if different tenants on the overlay bridge have
overlapping IP addresses such that they keep overwriting each other's
pseudo tunnel neighbor entries. These frequent updates are treated as
configuration changes and trigger revalidation each time, thus causing
a lot of useless revalidation load on the system.
To keep the ARP neighbor cache clean, this patch moves tunnel neighbor
snooping from the generic function do_xlate_actions() to the specific
function terminate_native_tunnel() in compose_output_action(). Thus,
only ARP and Neighbor Advertisement packets addressing a local
tunnel endpoint (on the LOCAL port of the underlay bridge) are snooped.
In order to achieve this, IP addresses of the bridge ports are retrieved
and then stored in xbridge by calling xlate_xbridge_set(). The
destination address extracted from the ARP or Neighbor Advertisement
packet is then matched against the known xbridge addresses in
is_neighbor_reply_correct() to filter the snooped packets further.
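The address filter in the last step boils down to a membership test of the packet's destination address against the addresses known for the bridge. A minimal sketch using Python's standard ipaddress module; the function name and the list-of-strings input are assumptions for illustration and do not reflect the xbridge data structures.

```python
import ipaddress


def is_reply_for_local_endpoint(dst_addr, bridge_addrs):
    """True if dst_addr (IPv4 or IPv6) is one of the bridge's own addresses."""
    known = {ipaddress.ip_address(addr) for addr in bridge_addrs}
    return ipaddress.ip_address(dst_addr) in known
```

Parsing both sides into address objects avoids false mismatches between different textual forms of the same IPv6 address.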
Signed-off-by: Zoltan Balogh <zoltan.balogh.eth@gmail.com> Co-authored-by: Jan Scheurich <jan.scheurich@ericsson.com> Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
tests: Inject ARP replies for snoop tests on different port
The ARP replies injected into the underlay bridge 'br0' to trigger
ARP snooping should be destined to the bridge's LOCAL port. So far
the tests injected them on LOCAL port 'br0' itself, which didn't matter
as OVS snooped on all ARP packets passing the bridge.
This patch injects the ARP replies on a different port in preparation for
an upcoming commit that will make OVS only snoop on ARP packets output
to the LOCAL port.
The clone() wrapper must be added to the generated datapath flows now as
the traced packets would actually be transmitted through the tunnel port.
Previously the underlay bridge dropped the packets as the learned egress
port for the tunnel nexthop was the LOCAL port, which also served as
virtual ingress port for the encapsulated traffic. The translation
end result was an expensive way to say 'drop'.
Signed-off-by: Zoltan Balogh <zoltan.balogh.eth@gmail.com> Co-authored-by: Jan Scheurich <jan.scheurich@ericsson.com> Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
ofproto-dpif-xlate: makes OVS native tunneling honor tunnel-specified source addresses
It makes OVS native tunneling honor tunnel-specified source addresses,
in the same way that Linux kernel tunneling honors them.
This patch allows a valid tun_src specified by a flow action to be used as
the tunnel source address of the packet. It adds a "local" property to route
entries and gives local routes a higher priority than user routes.
As in the kernel, when looking up the route, if a tun_src is specified
by a flow action or by port options, first check whether the tun_src is a
local address, then look up the route.
Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: frank.zeng <frank.zeng@ucloud.cn> Signed-off-by: Ben Pfaff <blp@ovn.org>
lacp: New command "lacp/show-stats" for displaying LACP counters.
Currently OVS does not provide any command to display stats for LACP
without which it is difficult to debug LACP issues. Here we propose
to display various statistics about LACP PDUs and slave state change.
tutorial: skip passing .db for backup option for ovn_start_ovsdb_server:
The current parameters pass sb1.db twice, which is redundant:
e.g. ovsdb-server --remote=punix:sb1.ovsdb sb1.db sb1.db
expected:
e.g. ovsdb-server --remote=punix:sb1.ovsdb sb1.db
tested and works as expected:
ovn-sbctl --db=unix:/root/ovs/tutorial/sandbox/sb2.ovsdb show
Chassis "chassis-1"
hostname: sandbox
Encap geneve
ip: "127.0.0.1"
options: {csum="true"}
Signed-off-by: aginwala <aginwala@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Jakub Sitnicki [Wed, 18 Apr 2018 16:01:13 +0000 (18:01 +0200)]
tests: Extend Python IDL checks to also run with SSL
Extend the macro for running a Python IDL test against an OVSDB server
that uses SSL so that it can be used for regular IDL tests and for the
notify tests.
This makes it easy to generate additional Python IDL tests that run
using SSL, so do it.
As it turns out, newly added SSL tests unearth a pre-existing issue with
unicode encoding when SSL is used, which will be fixed in the following
patch.
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Jakub Sitnicki [Wed, 18 Apr 2018 16:01:14 +0000 (18:01 +0200)]
python: Fix a double encoding attempt on a Unicode string
Encoding from 'unicode' to 'str' that has been added to the Stream class
in commit 2254074e3067 ("python: fix python3 encode/decode on Windows")
conflicts with SSLStream which already contains a quirk for pyopenssl
that does the same thing.
This results in a double encoding attempt when SSL is used and we crash
and burn due to:
Traceback (most recent call last):
File "../.././test-ovsdb.py", line 874, in <module>
main(sys.argv)
File "../.././test-ovsdb.py", line 869, in main
func(*args)
File "../.././test-ovsdb.py", line 655, in do_idl
idl_set(idl, command, step)
File "../.././test-ovsdb.py", line 526, in idl_set
status = txn.commit_block()
File "/home/jkbs/src/ovs/python/ovs/db/idl.py", line 1405, in commit_block
status = self.commit()
File "/home/jkbs/src/ovs/python/ovs/db/idl.py", line 1388, in commit
if not self.idl._session.send(msg):
File "/home/jkbs/src/ovs/python/ovs/jsonrpc.py", line 540, in send
return self.rpc.send(msg)
File "/home/jkbs/src/ovs/python/ovs/jsonrpc.py", line 244, in send
self.run()
File "/home/jkbs/src/ovs/python/ovs/jsonrpc.py", line 203, in run
retval = self.stream.send(self.output)
File "/home/jkbs/src/ovs/python/ovs/stream.py", line 808, in send
return super(SSLStream, self).send(buf)
File "/home/jkbs/src/ovs/python/ovs/stream.py", line 391, in send
buf = buf.encode('utf-8')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 83: ordinal not in range(128)
Remove the quirk from SSLStream as the base class now does encoding.
Reported-by: Marcin Mirecki <mmirecki@redhat.com> Signed-off-by: Jakub Sitnicki <jkbs@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
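The double-encoding problem described in the commit above can be illustrated with a minimal sketch. It assumes Python 3 semantics, where encoding an already-encoded buffer fails differently than the implicit ASCII decode shown in the Python 2 traceback. The helper name encode_once is hypothetical and is not part of ovs.stream; it only shows the idea that exactly one layer should turn text into bytes.

```python
def encode_once(buf):
    """Return buf as UTF-8 bytes, encoding only if it is still text.

    Hypothetical helper for illustration; not the ovs.stream code.
    """
    if isinstance(buf, str):
        return buf.encode("utf-8")
    return buf


# Text containing a non-ASCII character (U+00A7, encoded as 0xc2 0xa7).
payload = u'{"name": "\u00a7"}'

wire = encode_once(payload)       # first layer: str -> bytes
wire_again = encode_once(wire)    # second layer: already bytes, left alone
```

If both a subclass and its base class encode unconditionally, the second call operates on data that is no longer text, which is the situation the fix avoids by keeping the encoding in the base class only.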
Jakub Sitnicki [Wed, 18 Apr 2018 16:01:12 +0000 (18:01 +0200)]
tests: Reuse OVSDB_CHECK_IDL_PYN macro for IDL notify tests
OVSDB_CHECK_IDL_NOTIFY_PYN macro is the same as OVSDB_CHECK_IDL_PYN
except it doesn't support PRE-IDL-TXN parameter. Reuse the more generic
OVSDB_CHECK_IDL_PYN macro.
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Jakub Sitnicki [Wed, 18 Apr 2018 16:01:11 +0000 (18:01 +0200)]
tests: Remove useless run-if-fail commands passed to AT_CHECK
Path to ovsdb-server's pidfile has changed long ago when
ovsdb_start_idltest() helper was introduced in commit 561205007e17
("tests: Get rid of overly specific --pidfile and --unixctl options.")
but the run-if-fail commands were left behind.
Besides, we don't need to kill the ovsdb-server from the AT_CHECK
anymore since ovsdb_start_idltest() registers an on_exit hook that will
do it.
Clean up any run-if-fail commands that attempt to kill ovsdb-server
using an invalid pidfile.
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Jakub Sitnicki [Wed, 18 Apr 2018 16:01:10 +0000 (18:01 +0200)]
tests: Complain if key and certs not provided for SSL connection
Add an argument check to test-ovsdb.py to ensure that the user has
provided the private key, the certificate, and the peer CA certificate
needed to set up an SSL connection.
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
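The kind of argument check described in the commit above might look like the following sketch. The function name, parameter names, and error text are assumptions made for illustration; they are not taken from test-ovsdb.py.

```python
def check_ssl_args(remote, private_key=None, certificate=None, ca_cert=None):
    """Raise ValueError if an "ssl:" remote lacks key, cert, or CA cert.

    Hypothetical helper; names and message are illustrative only.
    """
    if not remote.startswith("ssl:"):
        return  # Non-SSL remotes need no key material.
    required = (
        ("private key", private_key),
        ("certificate", certificate),
        ("peer CA certificate", ca_cert),
    )
    missing = [name for name, value in required if not value]
    if missing:
        raise ValueError("SSL connection requires: " + ", ".join(missing))
```

Failing early with a message that names the missing file is easier to diagnose than an error raised later from inside the SSL handshake.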
Recently, an issue was debugged that was thought to be a bond
failover triggered issue. It turned out to be a VLAN interface MTU setting
issue that had nothing to do with bonding or most other likely possibilities.
Besides the effect of not setting the MTU to the desired value, this can
result in increased netlink traffic and processing with associated wasted
work. Let us flag a configuration issue at warn level (rather than dbg) to
catch the problem early.
Signed-off-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Jakub Sitnicki [Wed, 18 Apr 2018 10:35:08 +0000 (12:35 +0200)]
python: Fix reporting that test-ovsdb.py command needs more args
In Python OVSDB tester, we are not unpacking a value from n_args tuple
that holds the accepted range of arguments. This causes an error:
$ python tests/test-ovsdb.py idl tests/idltest.schema
Traceback (most recent call last):
File "./tests/test-ovsdb.py", line 869, in <module>
main(sys.argv)
File "./tests/test-ovsdb.py", line 852, in main
n_args, len(args)))
TypeError: %d format: a number is required, not tuple
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Mark Michelson <mmichels@redhat.com>
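The TypeError in the commit above can be reproduced in isolation. The sketch below assumes, as the commit message states, that n_args is a tuple holding the accepted range of arguments. The function names and the message wording are hypothetical and do not reproduce the actual strings in test-ovsdb.py.

```python
def buggy_message(command, n_args, actual):
    # n_args is a (min, max) tuple, but %d expects a single number.
    return ('"%s" requires %d arguments but %d provided'
            % (command, n_args, actual))


def fixed_message(command, n_args, actual):
    # Unpack the range before formatting it.
    low, high = n_args
    return ('"%s" requires between %d and %d arguments but %d provided'
            % (command, low, high, actual))


try:
    buggy_message("idl", (2, 3), 1)
    bug_reproduced = False
except TypeError:
    bug_reproduced = True

message = fixed_message("idl", (2, 3), 1)
```

The point is only that the tuple has to be unpacked before it reaches a %d conversion; how the range is worded in the message is a separate choice.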
Terry Wilson [Fri, 13 Apr 2018 00:24:27 +0000 (19:24 -0500)]
Add multi-column index support for the Python IDL
This adds multi-column index support for the Python IDL that is
similar to the feature in the C IDL. Since it adds sortedcontainers
as a dependency and some distros don't yet package it, the library
is copied in-tree and used if sortedcontainers is not installed.
Signed-off-by: Terry Wilson <twilson@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
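As a rough illustration of what a multi-column index does, the sketch below keeps rows sorted by a tuple of column values using only the standard library bisect module. It does not use sortedcontainers and does not reproduce the Python IDL API from the commit above; the class, method, and column names are all hypothetical.

```python
import bisect


class MultiColumnIndex(object):
    """Keep rows sorted by a tuple of column values (illustrative sketch)."""

    def __init__(self, *columns):
        self.columns = columns
        self._keys = []  # sorted list of key tuples
        self._rows = []  # rows, kept parallel to self._keys

    def _key(self, row):
        return tuple(row[c] for c in self.columns)

    def add(self, row):
        key = self._key(row)
        pos = bisect.bisect_right(self._keys, key)
        self._keys.insert(pos, key)
        self._rows.insert(pos, row)

    def find(self, *values):
        """Return all rows whose indexed columns equal the given values."""
        key = tuple(values)
        lo = bisect.bisect_left(self._keys, key)
        hi = bisect.bisect_right(self._keys, key)
        return self._rows[lo:hi]


index = MultiColumnIndex("datapath", "tunnel_key")
index.add({"datapath": "dp1", "tunnel_key": 2, "name": "b"})
index.add({"datapath": "dp1", "tunnel_key": 1, "name": "a"})
index.add({"datapath": "dp2", "tunnel_key": 1, "name": "c"})
matches = index.find("dp1", 2)
```

A sorted container makes lookups by the full key logarithmic and also keeps rows in key order, which is why a sorted-list library is a natural dependency for this kind of feature.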
rhel: Fix literal dollar sign usage in systemd service files
Currently (at least on RHEL 7.5) openvswitch fails to start (with DPDK
enabled) as non-root, since chown fails and "/dev/hugepages" group is not
changed.
Commit tested on Fedora 28 and RHEL 7.5, both as root and as non-root user.
From man 5 systemd.service:
To pass a literal dollar sign, use "$$". Variables whose value is not known
at expansion time are treated as empty strings. Note that the first argument
(i.e. the program to execute) may not be a variable.
CC: Aaron Conole <aconole@redhat.com> Fixes: 4299145c1095 ("rhel: don't drop capabilities when running as root") Signed-off-by: Timothy Redaelli <tredaelli@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Aaron Conole <aconole@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com>
Ben Pfaff [Tue, 17 Apr 2018 15:33:41 +0000 (08:33 -0700)]
netdev: Fix typos in comment.
Fixes: ee4776b8bce1 ("netdev: New function netdev_get_ip_by_name().") Suggested-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>