Russell Bryant [Thu, 26 Oct 2017 21:33:18 +0000 (14:33 -0700)]
Introduce Emeritus Committer status.
This patch introduces an Emeritus status for OVS committers. An
Emeritus Committer is recognized as having made a significant impact
to the project and having been a committer in the past. It is
intended as an option for those that do not currently have the time or
interest to fulfill committer responsibilities based on their current
responsibilities. While in this status, they are not included in
voting for governance purposes.
An emeritus committer may be re-instated as a full committer at any
time.
The OVS committers voted approval of this change.
See documentation contents for full details.
Suggested-by: Ethan J. Jackson <ejj@eecs.berkeley.edu> Acked-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ethan J. Jackson <ethan@kelda.io> Signed-off-by: Russell Bryant <russell@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Thu, 26 Oct 2017 23:49:01 +0000 (16:49 -0700)]
dpif-netdev: Initialize new rxqs in port_reconfigure().
valgrind reported use of uninitialized data in port_reconfigure(), which
was due to xrealloc() not initializing the newly added data, combined with
dp_netdev_rxq_set_intrvl_cycles() reading 'intrvl_idx' from the added data.
This avoids the warning.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Kevin Traynor <ktraynor@redhat.com>
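As a rough illustration of the problem class described above, the sketch
below grows an array with realloc() and zeroes only the newly added tail.
It is not the actual dpif-netdev change: 'struct rxq_sketch' and
grow_rxqs() are hypothetical names, and only the 'intrvl_idx' field
mentioned in the message is modeled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a per-queue record; only the field named in
 * the commit message is modeled. */
struct rxq_sketch {
    unsigned intrvl_idx;
};

/* Grows 'rxqs' from 'old_n' to 'new_n' elements and zeroes the added
 * tail, because realloc() leaves newly added bytes uninitialized. */
static struct rxq_sketch *
grow_rxqs(struct rxq_sketch *rxqs, size_t old_n, size_t new_n)
{
    assert(new_n > 0 && new_n >= old_n);

    struct rxq_sketch *p = realloc(rxqs, new_n * sizeof *p);
    if (!p) {
        abort();                /* Assumption: fail hard, never return NULL. */
    }
    /* Existing elements [0, old_n) are preserved by realloc(). */
    memset(p + old_n, 0, (new_n - old_n) * sizeof *p);
    return p;
}
```

Zeroing only the tail keeps the existing queue state intact while giving
any later reader of the new elements a defined value.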
ovs-save: Use a file to restore flows instead of heredoc
This patch makes ovs-save use a file to restore flows instead of a
shell script here-document.
This is needed since eval + here-documents are much slower than reading a file
with the rules directly.
rhel: Add systemd support to delete transient ports only on boot
Using the dependencies feature of systemd, ovs-delete-transient-ports.service
is started only once, so transient ports are deleted only the first
time after boot.
Daniel Alvarez [Thu, 26 Oct 2017 12:52:22 +0000 (14:52 +0200)]
Add dl_type to flow metadata for correct interpretation of conntrack metadata
When a packet is sent to the controller, dl_type is not stored in the
'ofputil_packet_in_private'. When the packet is resumed, the flow's
dl_type is 0 and this leads to an invalid value in ct_orig_tuple in the
pkt_metadata.
This patch adds the dl_type to the metadata so that conntrack
information can be interpreted correctly when packets are resumed.
This is a change from the ordinary practice, since flow_get_metadata() is
only supposed to deal with metadata and dl_type is not metadata. It is
necessary when ct_state is involved, though, because ct_state only applies
in the case of particular Ethertypes (IPv4 and IPv6 currently), so we need
to add it as a kind of prerequisite. (This isn't ideal; maybe we didn't
think through the ct_state mechanism carefully enough.)
Reported-by: Daniel Alvarez Sanchez <dalvarez@redhat.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2017-October/339868.html Signed-off-by: Daniel Alvarez <dalvarez@redhat.com> Signed-off-by: Numan Siddique <nusiddiq@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
In the case of openstack load balancer 'octavia' service, it creates logical
ports 'P1' (M1 IP1) and 'P2' (M2 IP2). It then disables logical port P2 and
adds IP2 to P1 - (M1 IP1 IP2).
When another port tries to reach IP2, it doesn't get delivered to port P1 because
of the above flow.
Signed-off-by: Numan Siddique <nusiddiq@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Numan Siddique [Mon, 16 Oct 2017 09:42:07 +0000 (15:12 +0530)]
ovn pacemaker: Provide the option to configure inactivity probe value
In the case of OVN HA deployments with OpenStack, it has been noticed
that the 5 second inactivity probe interval is not enough: ovsdb-servers
do not get the echo reply back from the IDL clients and disconnect the
connections.
This patch
- provides an option to configure this value.
- creates a connection row in NB/SB dbs and sets the target and
inactivity_probe values when the node is promoted to master.
CC: Andy Zhou <azhou@ovn.org> Signed-off-by: Numan Siddique <nusiddiq@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Numan Siddique [Wed, 25 Oct 2017 18:03:03 +0000 (23:33 +0530)]
Check flow's dl_type before setting ct_orig_tuple in 'pkt_metadata_from_flow()'
Normally flow's dl_type will be a valid value. However when a packet is sent to
the controller, dl_type is not stored in the 'ofputil_packet_in_private'. When
the controller resumes (OFPRAW_NXT_RESUME) the packet, the flow's dl_type will be
0. If the flow's ct_state has a valid value, then 'pkt_metadata_from_flow'
neither sets the ct_orig_tuple from the flow nor resets it. This results in an
invalid ct_orig_tuple value in the pkt_metadata.
This patch handles this situation by checking the dl_type before setting the
ct_orig_tuple. If dl_type is 0, it resets it. It also resets ct_orig_tuple if
dl_type is non zero and other than IPv4 or IPv6.
Reported-by: Daniel Alvarez Sanchez <dalvarez@redhat.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2017-October/339868.html Signed-off-by: Numan Siddique <nusiddiq@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
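The decision described above can be sketched as follows. This is a
simplified model, not the OVS code: the structs, the *_SKETCH constants
and md_from_flow_sketch() are hypothetical, only the source address of the
original tuple is modeled, and dl_type is kept in host byte order for
brevity.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_TYPE_IP_SKETCH   0x0800   /* IPv4 Ethertype. */
#define ETH_TYPE_IPV6_SKETCH 0x86dd   /* IPv6 Ethertype. */

/* Hypothetical, heavily reduced flow. */
struct flow_sketch {
    uint16_t dl_type;          /* 0 after a resume without dl_type. */
    uint32_t ct_state;
    uint32_t ct_nw_src;        /* IPv4 original-direction source. */
    uint8_t ct_ipv6_src[16];   /* IPv6 original-direction source. */
};

/* Hypothetical, heavily reduced packet metadata. */
struct md_sketch {
    uint32_t ct_state;
    int ct_orig_tuple_ipv6;
    union {
        uint32_t ipv4_src;
        uint8_t ipv6_src[16];
    } ct_orig_tuple;
};

static void
md_from_flow_sketch(struct md_sketch *md, const struct flow_sketch *flow)
{
    assert(md && flow);

    md->ct_state = flow->ct_state;

    /* Reset first, so that every path leaves a defined tuple behind. */
    md->ct_orig_tuple_ipv6 = 0;
    memset(&md->ct_orig_tuple, 0, sizeof md->ct_orig_tuple);

    if (flow->ct_state) {
        if (flow->dl_type == ETH_TYPE_IP_SKETCH) {
            md->ct_orig_tuple.ipv4_src = flow->ct_nw_src;
        } else if (flow->dl_type == ETH_TYPE_IPV6_SKETCH) {
            md->ct_orig_tuple_ipv6 = 1;
            memcpy(md->ct_orig_tuple.ipv6_src, flow->ct_ipv6_src,
                   sizeof md->ct_orig_tuple.ipv6_src);
        }
        /* dl_type 0 or any other Ethertype: the tuple stays reset. */
    }
}
```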
tests/stp: Use long warps instead of multiple calls.
This change fixes a constant test failure on RHEL 7 systems with
CFLAGS='-march=native':
>---------------------------------------------------------------<
2222: STP - flush the fdb and mdb when topology changed FAILED
...
./stp.at:609: ovs-appctl fdb/show br0
--- -
+++ ./tests/testsuite.dir/at-groups/2222/stdout
@@ -1,2 +1,3 @@
port VLAN MAC Age
+LOCAL 1 00:0c:29:a0:27:d1 33
>---------------------------------------------------------------<
Long warps give threads a chance to perform some work on each
step, unlike multiple appctl calls. Also, the code looks cleaner and
runs faster.
CC: Tonghao Zhang <xiangxia.m.yue@gmail.com> Fixes: 427e9751f300 ("tests: Add and improve stp tests.") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ilya Maximets [Fri, 6 Oct 2017 07:15:42 +0000 (10:15 +0300)]
tests: Add timeout to OVS_APP_EXIT_AND_WAIT.
ovs-appctl can wait indefinitely while executing an exit for a
dead service. Let's add a timeout (10 seconds should be
reasonable) to exit calls to avoid hanging the testsuite
in such cases.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Wed, 27 Sep 2017 20:50:26 +0000 (13:50 -0700)]
tests: Add support for 1-argument 'seq' in emulation.
The testsuite has an emulation of the common utility 'seq' that only
supported 2- and 3-argument forms. This commit adds support for the
1-argument form.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Russell Bryant <russell@ovn.org>
Ben Pfaff [Thu, 5 Oct 2017 06:38:43 +0000 (23:38 -0700)]
jsonrpc: Increment sequence number when connection actually made.
The purpose of the sequence number is to allow the client to figure out
when the connection status has changed. The significant event for the
client is when a connection completes, not when a connection attempt
starts. Thus, this commit changes the code to increment the sequence
number at completion, not at the attempt.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Russell Bryant <russell@ovn.org>
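A minimal sketch of the behavior described above, assuming a toy session
with three states. The type and function names are hypothetical and only
the connect path is modeled; the point is that the sequence number moves
when a connection completes, not when an attempt starts.

```c
#include <assert.h>
#include <stdbool.h>

enum conn_state_sketch { S_IDLE, S_CONNECTING, S_CONNECTED };

/* Hypothetical session: just a state and the sequence number. */
struct session_sketch {
    enum conn_state_sketch state;
    unsigned int seqno;
};

/* Starting an attempt does not touch 'seqno'. */
static void
session_start_connect(struct session_sketch *s)
{
    s->state = S_CONNECTING;
}

/* Only a completed connection is a status change the client cares about,
 * so this is where 'seqno' is incremented. */
static void
session_connect_done(struct session_sketch *s, bool ok)
{
    assert(s->state == S_CONNECTING);
    if (ok) {
        s->state = S_CONNECTED;
        s->seqno++;
    } else {
        s->state = S_IDLE;
    }
}
```

A client that polls 'seqno' then sees no change across failed attempts and
exactly one change per established connection.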
Mark Michelson [Fri, 20 Oct 2017 14:46:19 +0000 (09:46 -0500)]
Documentation: Add document describing RBAC
Role based access control is a relatively new addition to OVS/OVN, and
aside from the database documentation in ovn-sb(5), there is not much
explaining what RBAC is, how to use it, and the available roles. This
document remedies that situation.
Hopefully, any new roles will be added to this document in
the future.
Signed-off-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
William Tu [Sat, 14 Oct 2017 04:33:58 +0000 (21:33 -0700)]
ofproto-dpif-xlate: Fix truncate and native tunnel
Previous commit a67b337dc281 breaks the truncate and native
tunnel testcase by removing the truncate flag. The patch fixes
it by putting it back. Reproduce the error by:
> make check-system-userspace TESTSUITEFLAGS='17'
Fixes: a67b337dc281 ("ofproto-dpif-xlate: Remove assertion for truncated") Cc: IWASE Yusuke <iwase.yusuke0@gmail.com> Signed-off-by: William Tu <u9012063@gmail.com> Acked-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: Gurucharan Shetty <guru@ovn.org>
On RHEL 7.4 (with iproute-3.10.0-87), a DHCP provided
ipv4 address has the "dynamic" keyword set. For example,
"ip addr show breth0 | grep inet" shows:
inet 10.116.248.91/20 brd 10.116.255.255 scope global dynamic breth0
inet6 fe80::250:56ff:fea8:fdf0/64 scope link
The keyword "dynamic" (according to 'man ip-address') is only
used for ipv6, but in this case this is not true. Our current
code will skip the ipv4 address restoration because of this.
With this commit, we special-case the "dynamic" keyword to be valid
for ipv4.
Anand Kumar [Thu, 19 Oct 2017 20:26:17 +0000 (13:26 -0700)]
datapath-windows: Update OvsIPv4TunnelKey flags in geneve decap.
Currently, the OvsLookupFlow fails for the decap packet,
when the Geneve options are present in the packet as the OvsIPv4TunnelKey
flags are not set in the Geneve decap.
Set the OvsIPv4TunnelKey flags OVS_TNL_F_OAM and OVS_TNL_F_CRT_OPT
in OvsDecapGeneve based on the geneve header. Also set OVS_TNL_F_GENEVE_OPT
if the packet has geneve options.
Signed-off-by: Anand Kumar <kumaranand@vmware.com> Acked-by: Sairam Venugopal <vsairam@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@ovn.org> Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>
When the logrotate script runs, and Open vSwitch is running as a non-root
user, the /var/log/openvswitch directory doesn't have other rx bits set.
This means the reopen attempt will fail with "permission denied", even though
the default logrotate configuration creates a new log file with the
appropriate attributes.
This change sets the r/x bits for other on /var/log/openvswitch.
Ben Pfaff [Tue, 17 Oct 2017 23:51:42 +0000 (16:51 -0700)]
ovs-atomic: Add C++ compatible implementation.
G++ 5 does not implement the _Atomic keyword, which is part of C11 but not
C++11, so the existing <stdatomic.h> based atomic implementation doesn't
work. This commit adds a new implementation based on the C++11 <atomic>
header.
In this area, C++ is pickier about types than C, so a few of the
definitions in ovs-atomic.h have to be updated to use more precise types
for integer constants.
This updates the code that generates cxxtest.cc to #include <config.h>
(so that HAVE_ATOMIC is defined) and to automatically regenerate when the
program is reconfigured (because otherwise the #include <config.h> won't
get added without a "make clean" step).
"ovs-atomic.h" is not a public header, but apparently some code was
using it anyway.
Fixes: 9c463631e8145 ("ovs-atomic: Report error for contradictory configuration.") Reported-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
IWASE Yusuke [Wed, 4 Oct 2017 13:54:16 +0000 (22:54 +0900)]
ofproto-dpif-xlate: Remove assertion for truncated
Because the OpenFlow spec does not clearly stipulate that "max_len" in
an OUTPUT action must be zero when "port" is other than OFPP_CONTROLLER,
asserting on the value of "max_len" is too strict; "max_len" should
simply be ignored when it is not used.
This assertion also breaks interoperability with some controller
implementations.
This patch removes these redundant assertions about truncation.
Signed-off-by: IWASE Yusuke <iwase.yusuke0@gmail.com> Signed-off-by: Andy Zhou <azhou@ovn.org>
Ben Pfaff [Tue, 10 Oct 2017 14:58:59 +0000 (07:58 -0700)]
util: Make ovs_assert() always expand so that it should be followed by ;
ovs_assert() is normally invoked like a function call, e.g.:
ovs_assert(true);
but its expansion was a full statement, so that this ended up expanding to:
if (!OVS_LIKELY(true)) { \
ovs_assert_failure(OVS_SOURCE_LOCATOR, __func__, #CONDITION); \
};
with both } and ; at the end, which is weird and somewhat risky around 'if'
statements.
This commit fixes the problem, making ovs_assert() expand to an expression.
Reported-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Fri, 15 Sep 2017 00:01:18 +0000 (17:01 -0700)]
ofp-print: Avoid trailing white space in OpenFlow dumps.
ofp_to_string() sometimes yields a trailing space in its output. This is
annoying for the test infrastructure, since we have to specially mark the
trailing white space in Autotest with a "@&t@" marker at the end of the
line. This commit gets rid of the trailing white space and the annoying
"@&t@" markers.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Russell Bryant <russell@ovn.org>
Ben Pfaff [Fri, 15 Sep 2017 00:00:38 +0000 (17:00 -0700)]
util: Avoid trailing white space in hex dumps.
ovs_hex_dump() sometimes yields a trailing space in its output. This is
annoying for the test infrastructure, since we have to specially mark the
trailing white space in Autotest with a "@&t@" marker at the end of the
line. This commit gets rid of the trailing white space and the annoying
"@&t@" markers.
This also gets rid of an occasional trailing hyphen.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Russell Bryant <russell@ovn.org>
Ben Pfaff [Thu, 14 Sep 2017 22:31:48 +0000 (15:31 -0700)]
table: Avoid trailing white space in tables.
Commands that use the table library, such as ovs-vsctl and "ovsdb-client
dump", print trailing white space in tabular output, to fill out the entire
width of their tabular columns. This is annoying whenever we use these
commands in the test infrastructure, since we have to specially mark the
trailing white space in Autotest with a "@&t@" marker at the end of the
line. This commit gets rid of the trailing white space and the annoying
"@&t@" markers.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Russell Bryant <russell@ovn.org>
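One way to get this effect, shown only as a sketch and not as the table
library's actual code, is to pad every cell to its column width and then
trim trailing spaces from the finished line. format_row_sketch() and its
parameters are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Formats 'n' cells into 'out', left-justified to 'widths' and separated
 * by one space, then strips trailing spaces.  Returns the final length. */
static size_t
format_row_sketch(char *out, size_t out_size,
                  const char *const cells[], const int widths[], size_t n)
{
    size_t len = 0;

    assert(out_size > 0);
    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        int r = snprintf(out + len, out_size - len, "%s%-*s",
                         i ? " " : "", widths[i], cells[i]);
        if (r < 0) {
            out[len] = '\0';            /* Formatting error: keep prefix. */
            break;
        }
        if ((size_t) r >= out_size - len) {
            len = out_size - 1;         /* Truncated: buffer is full. */
            break;
        }
        len += (size_t) r;
    }
    /* Drop the padding of the last non-empty column. */
    while (len > 0 && out[len - 1] == ' ') {
        out[--len] = '\0';
    }
    return len;
}
```

Trimming at the end also covers rows whose last cells are empty, which
padding-aware logic on the final column alone would miss.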
Ben Pfaff [Wed, 30 Aug 2017 16:43:11 +0000 (09:43 -0700)]
daemon-unix: With --monitor, only close standard fds if --detach also used.
Daemons generally should close the standard fds because they don't want to
hold open an SSH session, etc. that is attached to a tty. But --monitor
without --detach does not daemonize, so do not close fds in that case.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Russell Bryant <russell@ovn.org>
Ben Pfaff [Thu, 31 Aug 2017 22:09:13 +0000 (15:09 -0700)]
replication: Avoid theoretical use-after-free error in reset_database().
Code that calls ovsdb_txn_row_delete() should avoid referencing the
deleted row again, because it might be freed. In practice this shouldn't
really happen in this case because of the particular circumstances, but it
costs little to be careful.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Russell Bryant <russell@ovn.org>
We still use SysV scripts for RHEL. Currently, invoking
/etc/init.d/openvswitch will redirect the calls to
dynamically generated systemd scripts. In the above case when you call
"/etc/init.d/openvswitch-switch start", it in turn calls
"/bin/systemctl start openvswitch-switch.service" and
that in turn again calls "/etc/init.d/openvswitch-switch start".
This patch avoids it. This is similar to what was done to
Debian in commit 873d85653d8 (debian: Skip systemctl redirect.)
Signed-off-by: Gurucharan Shetty <guru@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
ovs-ctl.in: Call 'hostname -f' after vswitchd starts.
Currently we call 'hostname -f' when ovs-vswitchd is not
running. If you are using ovs-vswitchd to provide your
primary networking, then 'hostname -f' will "hang" till it
times out. On the system this issue was discovered, this was
as long as 40 seconds. This is a problem during OVS restarts
or upgrades.
This commit calls 'hostname -f' after ovs-vswitchd has started.
VMware-BZ: #1972026 Signed-off-by: Gurucharan Shetty <guru@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Fri, 29 Sep 2017 17:10:27 +0000 (10:10 -0700)]
ovs-atomic: Reintroduce atomic_uint64_t and atomic_int64_t.
This is essentially a revert of commit e09d61c41b4f ("ovs-atomic: Remove
atomic_uint64_t and atomic_int64_t.") My fear that some 32-bit platforms
did not support 64-bit integers seems overblown, because OVS 2.6.x uses
the 64-bit atomic_ullong and it is in Debian, which has tons of
architectures.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Simon Horman <simon.horman@netronome.com>
Iman Tabrizian [Mon, 2 Oct 2017 16:58:04 +0000 (20:28 +0330)]
Fix a typo in the controller name in the howto
This commit fixes a potential unintended mistake in the howto guide for
userspace tunneling.
Submitted-at: https://github.com/openvswitch/ovs/pull/209 Signed-off-by: Iman Tabrizian <tabrizian@outlook.com> Signed-off-by: Russell Bryant <russell@ovn.org>
Andy Zhou [Wed, 27 Sep 2017 23:53:45 +0000 (16:53 -0700)]
dpif-netdev: Use portable error code for zero rate meter band
'EBADRQC' is only defined on the Linux platform. Without this fix,
the Travis macOS build fails. Switch to EDOM, which is more
portable.
Fixes: 2029ce9ac3a601 (dpif-netdev: Fix a zero-rate bug for meter) CC: Ali Volkan ATLI <volkan.atli@argela.com.tr> Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
Ben Pfaff [Thu, 28 Sep 2017 16:27:25 +0000 (09:27 -0700)]
connmgr: Fix violation of flow monitoring protocol description.
nicira-ext.h says:
* 1. OVS sends an NXT_FLOW_MONITOR_PAUSED message to the controller, following
* all the already queued notifications. After it receives this message,
* the controller knows that its view of the flow table, as represented by
* flow monitor notifications, is incomplete.
The actual implementation could send NXT_FLOW_MONITOR_PAUSED in the middle
of a series of queued notifications. This fixes it to always send it after
those notifications. Possibly this confused some controllers, since the
documentation said that NXFME_ADD and NXFME_MODIFIED notifications wouldn't
be sent between "pause" and "resume" messages, but this bug could cause
them to be sent just after "pause".
VMware-BZ: #1919454 Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Gurucharan Shetty <guru@ovn.org>
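The ordering rule quoted above can be modeled with a small queue. This is
a sketch under simplifying assumptions (a fixed-size array, hypothetical
names, no "resume" handling) and not the connmgr implementation: the
"paused" message is appended after everything already queued, and later
add/modify notifications are refused while paused.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum msg_sketch { MSG_ADD, MSG_MODIFIED, MSG_PAUSED };

#define QUEUE_MAX_SKETCH 16

/* Hypothetical per-controller queue of flow monitor messages. */
struct monitor_conn_sketch {
    enum msg_sketch queue[QUEUE_MAX_SKETCH];
    size_t n;
    bool paused;
};

/* Queues an add/modify notification unless the connection is paused or
 * the queue is full.  Returns true if the message was queued. */
static bool
notify_sketch(struct monitor_conn_sketch *c, enum msg_sketch m)
{
    assert(m == MSG_ADD || m == MSG_MODIFIED);
    if (c->paused || c->n >= QUEUE_MAX_SKETCH) {
        return false;
    }
    c->queue[c->n++] = m;
    return true;
}

/* Appends the "paused" message behind all queued notifications, never
 * in the middle of them. */
static void
pause_sketch(struct monitor_conn_sketch *c)
{
    if (!c->paused) {
        assert(c->n < QUEUE_MAX_SKETCH);
        c->queue[c->n++] = MSG_PAUSED;
        c->paused = true;
    }
}
```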
Conntrack, Conntrack-related, Stt, and IP fragmentation
have cleaner threads that run periodically to clean
up their respective tables. During driver unload,
OvsExtDetach() calls into routines that are meant
for explicitly cleaning these tables up and freeing
the resources associated with these threads.
If during driver unload, these cleaner threads run
immediately after the resources are freed, such as locks
used by these threads, then the cleaner threads result
in a kernel crash since they try to acquire locks
that have already been freed.
For example, OvsIpFragmentEntryCleaner() caused a kernel
crash because it tried to acquire a lock that was
already freed by OvsCleanupIpFragment().
The fix is to simply exit the cleaner thread if the
lock associated with the thread is not initialized,
because the only way the threads can run when the lock
is invalid is when the lock has been freed up during
driver unload.
Testing done:
Verified that cleaner threads run as expected without
crashing during driver unload.
Now that Patchwork 2.0 is out, folks can start to take advantage of some
of the new features that it offers. Chief among these is series support,
which is only exposed via the web UI and new REST API and which, in
turn, necessitates using git-pw rather than pwclient. As such, this tool
is briefly documented.
Signed-off-by: Stephen Finucane <stephen@that.guru> Signed-off-by: Ben Pfaff <blp@ovn.org>
Merge native tunnel handling with patch port handling
as much as possible.
Current native tunnel handling logic inspects the generated actions
to determine if truncate has been applied to the packet. (since if
it is then recirculation should be used). This logic can be
simplified by passing the 'truncate' boolean argument into
compose_output_action().
Signed-off-by: Andy Zhou <azhou@ovn.org> Tested-by: Greg Rose <gvrose8192@gmail.com> Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Andy Zhou [Thu, 24 Aug 2017 01:48:49 +0000 (18:48 -0700)]
ofproto-dpif: Unfreeze within clone
When translating actions within an OpenFlow clone, actions generated
by finish_freezing() should also be enclosed within the datapath
clone netlink encoding.
Signed-off-by: Andy Zhou <azhou@ovn.org> Tested-by: Greg Rose <gvrose8192@gmail.com> Reviewed-by: Greg Rose <gvrose8192@gmail.com>
These dead assignment warnings do not affect functionality.
In one case, a local variable could be removed and in another
case, the working pointer should be used rather than the start
pointer.
Fixes: bd5e81a0e596 ("Userspace Datapath: Add ALG infra and FTP.") Reported-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2017-September/338515.html Acked-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com> Signed-off-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Close a theoretical race delete/create corner case for alg
reverse conns and add debugging around this that may point to
an intentional exploit, unintentional problem or just a rare
condition. The solution is to keep track of reverse conn via
nat_conn_keys and avoid deleting the reverse conn when it has been
recreated.
Fixes: bd5e81a0e596 ("Userspace Datapath: Add ALG infra and FTP.") Signed-off-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Given that it is libopenvswitch-dev not libopenvswitch that depends on
libssl-dev, this patch updates debian/control file to reflect that
libopenvswitch-dev depends on libssl-dev, and libopenvswitch depends
on openssl.
Greg Rose [Mon, 11 Sep 2017 21:11:06 +0000 (14:11 -0700)]
datapath: Fix up vxlan device flags
I missed a couple of usages of the flags parameter from vxlan_dev
while adding compatibility code to handle the removal of the flags.
Add the checks so that the module can compile for Linux kernel
release 4.13
Fixes: 143656435c ("datapath: get rid of redundant vxlan_dev.flags") Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Andy Zhou <azhou@ovn.org>
Greg Rose [Mon, 11 Sep 2017 21:11:03 +0000 (14:11 -0700)]
datapath: Fixup RTNL ops for kernel 4.13
The RTNL ops validate and newlink functions now take the extended
netlink ack parameter. Use the new HAVE_EXT_ACK_IN_RTNL_LINKOPS
define to check if the additional parameter is present and add the
parameter if so.
While in the modules remove the checks for Linux kernels < 2.3.39
since they are no longer supported since 2.5.x.
Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Andy Zhou <azhou@ovn.org>
openvswitch: fix skb_panic due to the incorrect actions attrlen
For sw_flow_actions, the actions_len only represents the kernel part's
size, and when we dump the actions to the userspace, we will do the
conversions, so its true size may become bigger than the actions_len.
But unfortunately, for OVS_PACKET_ATTR_ACTIONS, we use the actions_len
to alloc the skbuff, so the user_skb's size may become insufficient and
oops will happen like this:
skbuff: skb_over_panic: text:ffffffff8148fabf len:1749 put:157 head: ffff881300f39000 data:ffff881300f39000 tail:0x6d5 end:0x6c0 dev:<NULL>
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:129!
[...]
Call Trace:
<IRQ>
[<ffffffff8148be82>] skb_put+0x43/0x44
[<ffffffff8148fabf>] skb_zerocopy+0x6c/0x1f4
[<ffffffffa0290d36>] queue_userspace_packet+0x3a3/0x448 [openvswitch]
[<ffffffffa0292023>] ovs_dp_upcall+0x30/0x5c [openvswitch]
[<ffffffffa028d435>] output_userspace+0x132/0x158 [openvswitch]
[<ffffffffa01e6890>] ? ip6_rcv_finish+0x74/0x77 [ipv6]
[<ffffffffa028e277>] do_execute_actions+0xcc1/0xdc8 [openvswitch]
[<ffffffffa028e3f2>] ovs_execute_actions+0x74/0x106 [openvswitch]
[<ffffffffa0292130>] ovs_dp_process_packet+0xe1/0xfd [openvswitch]
[<ffffffffa0292b77>] ? key_extract+0x63c/0x8d5 [openvswitch]
[<ffffffffa029848b>] ovs_vport_receive+0xa1/0xc3 [openvswitch]
[...]
Also we can see that the actions_len is much smaller than the orig_len:
crash> struct sw_flow_actions 0xffff8812f539d000
struct sw_flow_actions {
rcu = {
next = 0xffff8812f5398800,
func = 0xffffe3b00035db32
},
orig_len = 1384,
actions_len = 592,
actions = 0xffff8812f539d01c
}
So as a quick fix, use the orig_len instead of the actions_len to alloc
the user_skb.
Last, this oops happened on our system running a relatively old kernel, but
the same risk still exists on the mainline, since we use the wrong
actions_len from the beginning.
Fixes: ccea74457bbd ("openvswitch: include datapath actions with sampled-pac Cc: Neil McKee <neil.mckee@inmon.com> Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Fixes: 0e469d3b380c ("datapath: Include datapath actions with sampled-packet upcall to userspace.") Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Andy Zhou <azhou@ovn.org>
openvswitch: Remove unnecessary newlines from OVS_NLERR uses
OVS_NLERR already adds a newline so these just add blank
lines to the logging.
Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Joe Stringer <joe@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Andy Zhou <azhou@ovn.org>
openvswitch: Optimize operations for OvS flow_stats.
When calling flow_free() to free a flow, we call cpumask_next() many
times (once per CPU in cpu_possible_mask, e.g. 128 by default). That
consumes a lot of CPU time if flow_free() is called frequently.
This happens when we send all packets to userspace via upcall and OvS
sends them back via netlink to ovs_packet_cmd_execute() (which calls
flow_free()).
The test topology is shown below. VM01 sends TCP packets to VM02,
and OvS forwards the packets. When testing, we use perf to report the
system performance.
VM01 --- OvS-VM --- VM02
Without this patch, perf top shows flow_free() at 3.02% CPU usage.
With this patch, the TCP throughput between VMs (without the Megaflow
Cache and Microflow Cache) goes from 1.18Gbs/sec up to 1.30Gbs/sec
(maybe a ~10% performance improvement).
This patch adds a cpumask struct; cpu_used_mask stores the ids of the
CPUs that the flow has used. We then only check the flow_stats on the
CPUs that were used, so it is unnecessary to check all possible CPUs when
getting, cleaning, and updating the flow_stats. Adding cpu_used_mask to
the sw_flow struct doesn't increase the cacheline number.
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: Andy Zhou <azhou@ovn.org>
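The idea of tracking which CPUs a flow has used can be sketched in plain C
with a 64-bit mask in place of the kernel's cpumask. The struct and
function names are hypothetical and the 64-CPU limit is an assumption of
the sketch, not of the patch.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CPUS_SKETCH 64

/* Hypothetical per-flow statistics with one counter per CPU. */
struct flow_stats_sketch {
    uint64_t cpu_used_mask;                 /* Bit i set: CPU i has stats. */
    uint64_t packets[MAX_CPUS_SKETCH];
};

/* Records 'n' packets seen on 'cpu' and marks that CPU as used. */
static void
stats_update_sketch(struct flow_stats_sketch *f, unsigned cpu, uint64_t n)
{
    assert(cpu < MAX_CPUS_SKETCH);
    f->packets[cpu] += n;
    f->cpu_used_mask |= UINT64_C(1) << cpu;
}

/* Sums the counters, reading only slots whose bit is set and stopping as
 * soon as no used CPU remains.  '*visited' gets the number of slots read. */
static uint64_t
stats_get_sketch(const struct flow_stats_sketch *f, unsigned *visited)
{
    uint64_t total = 0;
    uint64_t mask = f->cpu_used_mask;

    *visited = 0;
    for (unsigned cpu = 0; mask; cpu++, mask >>= 1) {
        if (mask & 1) {
            total += f->packets[cpu];
            (*visited)++;
        }
    }
    return total;
}
```

With two CPUs in use, the reader touches two counters instead of all
MAX_CPUS_SKETCH, which is the saving the message describes for flow_free().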
openvswitch: Optimize updating for OvS flow_stats.
In ovs_flow_stats_update(), we only use the node
variable to allocate the flow_stats struct. But this is not the
common case, so it is unnecessary to call numa_node_id()
every time. This patch is not a bugfix, but there may be
a small performance increase.
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: Andy Zhou <azhou@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
SKB_GSO_UDP is removed in the upstream kernel. Use HAVE_SKB_GSO_UDP
define from acinclude to detect if SKB_GSO_UDP exists and if so apply
openvswitch section of this upstream patch.
Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Andy Zhou <azhou@ovn.org>
netdev-dpdk: reset packet_type for reused dp_packets.
DPDK uses a dp-packet pool for storing received packets. The pool is
reused by the rxq_recv functions of the DPDK netdevs. The datapath is
capable of modifying the packet_type property of packets, for instance
when encapsulated L3 packets are received on a ptap gre port.
In this case the packet_type property of struct dp_packet can be
modified and later the same dp_packet with the modified packet_type
can be reused in the rxq_recv function, so it can contain corrupted
data.
The dp_packet_batch_init_cutlen() in the rxq_recv functions iterates
over dp_packets and sets their cutlen. So I modified this function
to set packet_type to Ethernet for the dp_packets as well. I also
renamed this function because of the added functionality.
The dp_packet_batch_init_cutlen() iterates over batch->count dp_packets.
Therefore batch->count = nb_rx needs to be set before that
function is invoked. This is an additional fix.
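The two points above (reset per-packet fields on reuse, and set the count
before the reset loop runs) can be sketched as follows. All names are
hypothetical, and PT_ETH_SKETCH is a placeholder value standing for the
Ethernet packet type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BATCH_MAX_SKETCH 32
#define PT_ETH_SKETCH 0u        /* Placeholder for "Ethernet". */

/* Hypothetical reduced packet and batch. */
struct pkt_sketch {
    uint32_t packet_type;
    uint32_t cutlen;
};

struct batch_sketch {
    size_t count;
    struct pkt_sketch *packets[BATCH_MAX_SKETCH];
};

/* Resets the fields that a previous use of a pooled packet may have
 * changed.  Only the first 'count' packets are touched. */
static void
batch_init_packet_fields_sketch(struct batch_sketch *b)
{
    assert(b->count <= BATCH_MAX_SKETCH);
    for (size_t i = 0; i < b->count; i++) {
        b->packets[i]->cutlen = 0;
        b->packets[i]->packet_type = PT_ETH_SKETCH;
    }
}

/* Receive path: hands 'nb_rx' pooled packets to the batch. */
static void
rxq_recv_sketch(struct batch_sketch *b, struct pkt_sketch *pool[],
                size_t nb_rx)
{
    assert(nb_rx <= BATCH_MAX_SKETCH);
    for (size_t i = 0; i < nb_rx; i++) {
        b->packets[i] = pool[i];
    }
    b->count = nb_rx;           /* Must precede the reset loop below. */
    batch_init_packet_fields_sketch(b);
}
```

If 'count' were assigned after the reset call, the loop would run over the
stale count from the previous batch and could skip the new packets.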
dpif-netdev: Fix comments for pmd_load_cached_ports.
Commit 57eebbb4c315 replaces thread local 'pmd->port_cache' with
'pmd->tnl_port_cache' and 'pmd->send_port_cache' maps. Update the
comments accordingly.
Fixes: 57eebbb4c315 ("Don't try to output on a device without txqs") Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com> Signed-off-by: Darrell Ball <dlu998@gmail.com>
The variable 'cnt' is initialized and reused in multiple function calls
inside netdev_dpdk_send__(), which is sometimes confusing. Instead, introduce
'batch_cnt' to hold the original packet count and 'tx_cnt' to store
the final packet count after filtering and QoS operations.
Finally, 'tx_cnt' packets get transmitted on the respective 'qid'.
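A small sketch of the two-counter pattern, assuming a toy length filter
and a toy QoS budget. The helper functions and the result struct are
hypothetical; only the roles of 'batch_cnt' and 'tx_cnt' follow the
description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical filter: keeps packets no longer than 'max_len', compacting
 * 'lens' in place.  Returns the number of packets kept. */
static size_t
filter_sketch(int lens[], size_t n, int max_len)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (lens[i] <= max_len) {
            lens[kept++] = lens[i];
        }
    }
    return kept;
}

/* Hypothetical QoS step: lets at most 'budget' packets through. */
static size_t
qos_sketch(size_t n, size_t budget)
{
    return n < budget ? n : budget;
}

struct send_result_sketch {
    size_t sent;
    size_t dropped;
};

static struct send_result_sketch
send_sketch(int lens[], size_t n, int max_len, size_t budget)
{
    const size_t batch_cnt = n;     /* Original count, never modified. */
    size_t tx_cnt = batch_cnt;      /* Shrinks as packets are dropped. */

    tx_cnt = filter_sketch(lens, tx_cnt, max_len);
    tx_cnt = qos_sketch(tx_cnt, budget);
    /* A real sender would transmit the first 'tx_cnt' packets here. */

    struct send_result_sketch r = { tx_cnt, batch_cnt - tx_cnt };
    assert(r.sent + r.dropped == batch_cnt);
    return r;
}
```

Keeping the original count in its own variable makes the drop accounting
a single subtraction at the end.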