datapath: list: Fix double fetch of pointer in hlist_entry_safe()
Following patch backports commit f65846a1800ef8c48d (list: Fix double
fetch of pointer in hlist_entry_safe()) from upstream kernel.
This patch fixes following panic. Thanks to Jesse for helping to
debug this issue.
list: Fix double fetch of pointer in hlist_entry_safe()
The current version of hlist_entry_safe() fetches the pointer twice,
once to test for NULL and the other to compute the offset back to the
enclosing structure. This is OK for normal lock-based use because in
that case, the pointer cannot change. However, when the pointer is
protected by RCU (as in "rcu_dereference(p)"), then the pointer can
change at any time. This use case can result in the following sequence
of events:
1. CPU 0 invokes hlist_entry_safe(), fetches the RCU-protected
pointer as sees that it is non-NULL.
2. CPU 1 invokes hlist_del_rcu(), deleting the entry that CPU 0
just fetched a pointer to. Because this is the last entry
in the list, the pointer fetched by CPU 0 is now NULL.
3. CPU 0 refetches the pointer, obtains NULL, and then gets a
NULL-pointer crash.
This commit therefore applies gcc's "({ })" statement expression to
create a temporary variable so that the specified pointer is fetched
only once, avoiding the above sequence of events. Please note that
it is the caller's responsibility to use rcu_dereference() as needed.
This allows RCU-protected uses to work correctly without imposing
any additional overhead on the non-RCU case.
Many thanks to Eric Dumazet for spotting root cause!
Reported-by: CAI Qian <caiqian@redhat.com> Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Li Zefan <lizefan@huawei.com>
Bug #17099
Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
Ben Pfaff [Wed, 17 Jul 2013 22:53:26 +0000 (15:53 -0700)]
ofp-util: Fix port and queue stat counting for OpenFlow 1.3.
OpenFlow 1.0, 1.1, and 1.2 all have the same struct size for port and
queue stats. OpenFlow 1.3 has larger structs, but the ofp-util code didn't
realize that. This fixes the problem.
It appears that the only consequence of this problem would have been
printing the wrong count in ofp-print output.
datapath: Use correct type while allocating flex array.
Flex array is used to allocate hash buckets which is type struct
hlist_head, but we use `struct hlist_head *` to calculate
array size. Since hlist_head is of size pointer it works fine.
Following patch use correct type.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
Ben Pfaff [Thu, 25 Jul 2013 16:45:43 +0000 (09:45 -0700)]
util: Set thread name via pthreads in set_subprogram_name().
This makes "top" and "ps" output more readable on FreeBSD at least, and
the names are also visible in debuggers.
Suggested-by: Ed Maste <emaste@freebsd.org> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com> Tested-by: Andy Zhou <azhou@nicira.com>
ovs-bugtool: Separate capability for general network info.
Current situation is that CAP_NETWORK_STATUS has a max size of 50 MB.
When we have around 100,000 openflow flows, we over-run that size
by just running the "ovs-ofctl dump-flows" command. All the openvswitch
commands run through the plugin scripts in this repo won't have its
data stored in the debug bundle in this case as they are part of
CAP_NETWORK_STATUS too. One option to correct this is to increase
the CAP_NETWORK_STATUS max size to a higher number. But CAP_NETWORK_STATUS
also includes a bunch of general network related information collected
by running commands like ethtool, tc etc. and we probably want to limit
the data collected through those commands.
With this commit, we create a new capability called CAP_NETWORK_INFO
and collect general network related information through them. For OVS
related information, we continue to use CAP_NETWORK_STATUS, but remove
the maximum size restriction. One rationale to keep OVS related
information in CAP_NETWORK_STATUS is because xen-bugtool probably expects
OVS information in that capability.
Alex Wang [Tue, 23 Jul 2013 01:15:49 +0000 (18:15 -0700)]
vlan-splinter: Fix inverted logic bug.
When "other-config:enable-vlan-splinters=true" is set, the existing
vlans with ip address must be retained. The bug actually does the
opposite and retains the vlans without ip address. This commit fixes
it.
Reported-by: Roman Sokolkov <rsokolkov@gmail.com> Signed-off-by: Alex Wang <alexw@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Andy Zhou [Mon, 22 Jul 2013 22:08:00 +0000 (15:08 -0700)]
datapath: remove RCU annotation from flow->mask
After a mask is assigned to a flow, it will not change for the life of
the flow. Since flow access is protected by RCU lock, access to
flow->mask after getting a flow is always safe.
Suggested-by: Jesse Gross <jesse@nicira.com> Reported-by: Ben Pfaff <blp@nicira.com> Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
Alex Wang [Mon, 22 Jul 2013 16:19:57 +0000 (09:19 -0700)]
clang: Fix the "expression result unused" warning.
This commit makes macro function "ASSIGN_CONTAINER()" evaluates
to "(void)0". This is to avoid the 'clang' warning: "expression
result unused", since most of time, the final evaluated value
is not used.
Signed-off-by: Alex Wang <alexw@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Alex Wang [Mon, 22 Jul 2013 16:19:56 +0000 (09:19 -0700)]
clang: Fix segfault in unit tests.
It seems that 'clang' compiler applies strict protection on pointer
dereference. And it causes unexpected execution in macro functions
like "HMAP_FOR_EACH()" and unit test failures. This commit fixes
this issue and pass all unit tests.
Co-authored-by: Ethan Jackson <ethan@nicira.com> Signed-off-by: Alex Wang <alexw@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
datapath: Conditionally define skb_unclone in datapath compat code
Recent versions of Fedora have skb_unclone included in their
kernels. This patch adds a conditional check into the compat directory so as
not to error out by defining it twice. This allows the latest OVS kernel
module to build on Fedora 19.
netdev-linux: avoid negative value cast to non-negative in comparison
I am using a userspace OVS, there is a constant ERR message in log:
"dpif_netdev|ERR|error receiving data from ovs-netdev: Message too long"
Check the code and find there is a comparison between ssize_t and size_t type in
netdev_rx_linux_recv(). It makes the negative "retval" greater than the "size".
Change the sequence of comparison to avoid negative return value cast to
a positive and become greater then a non-negative value.
Signed-off-by: ZhengLingyun <konghuarukhr@163.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
ofproto-dpif: Make "ovs-appctl dpif/dump-flows" megaflow-aware.
The "ovs-appctl dpif/dump-flows" command wasn't updated to print
megaflows, so the output would not include wildcards even though the
datapath may, so the output was inconsistent and confusing.
ofproto-dpif: Expire fin_timeout actions when no previous timeout set.
Commit e503cc199 (ofproto: Optimise OpenFlow flow expiry) optimized
OpenFlow flow expiration by putting expirable flows on a list. However,
the list is only configured at rule creation time. If the rule is
created without a timeout, but is later set by the fin_timeout action,
it will never expire. This commit adds the rule to the list when the
action is triggered.
Signed-off-by: Justin Pettit <jpettit@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
ofproto-dpif: Don't put new subfacets as result of "userspace" action.
Don't install a flow if it's the result of the "userspace" action for an
already installed facet. This can occur when a datapath flow with
wildcards has a "userspace" action and flows sent to userspace result in
a different subfacet, which will then be rejected as overlapping by the
datapath.
Signed-off-by: Justin Pettit <jpettit@nicira.com> Co-authored-by: Ethan Jackson <ethan@nicira.com>
Ben Pfaff [Tue, 16 Jul 2013 17:43:07 +0000 (10:43 -0700)]
poll-loop: Simplify and speed up polling.
The simplification comes from dropping support for canceling a
poll_waiter, which was a feature that was never used. The speedup
comes from avoiding a malloc() for every call to poll_fd_wait().
(I doubt that this significantly improves performance.)
This prepares for making the polling structures per-thread in
the next commit.
datapath: Use masked flow when validating actions.
It is important to validate flow actions to ensure that they do
not try to write off the end of a packet. The mechanism to do this
is to ensure that a flow is precise enough to describe valid vs.
invalid packets and only allowing actions on valid flows.
The introduction of megaflows broke this by using a narrow base
flow but a potentially wide match. This meant that while the
original flow was properly validated, later packets might not
conform to that flow and could be truncated. This switches to
using the masked flow instead, effectively requiring that all
possible matching packets be valid in order for a flow's actions
to be accepted.
This change only affects the flow setup path - executed packets
have always used the flow extracted from the packet and therefore
were properly validated.
Ben Pfaff [Fri, 12 Jul 2013 16:39:15 +0000 (09:39 -0700)]
tests: Fix theoretical races in "ofproto-dpif - controller" test.
I don't see anything that guaranteed that ovs-ofctl would receive and print
the packets before it exited. This commit inserts explicit waits, to avoid
the problem.
hindex_next() make the completely wrong assumption that head nodes within
a bucket were sorted in ascending order by hash. This commit removes
that assumption.
Also add a test that would have found the problem.
Signed-off-by: ZhengLingyun <konghuarukhr@163.com>
[blp@nicira.com changed how hindex_head_node() is implemented and
other code details] Signed-off-by: Ben Pfaff <blp@nicira.com>
datapath: Revert "datapath: rhel: Account for RHEL specific backports"
This reverts commit 752378e1cd1f133a8366fbacec3b281a45ff8268
(datapath: rhel: Account for RHEL specific backports).
Change related to netif_needs_gso() is cuasing panic on RHEL
and Xen platforms. This way we revert back to use of ovs
skb_gso_segment() and netif_skb_features() which has required
compat code for gso for kernel older than 2.6.38.
Ben Pfaff [Wed, 10 Jul 2013 16:53:21 +0000 (09:53 -0700)]
stress: Remove essentially unused library.
The "stress" library was introduced years ago. We intended at the time to
start using it to provoke errors in testing, to make sure that Open vSwitch
was resilient against those errors. The intention was good, but there were
few actual implementations of stress options, and the testing never
materialized.
Rather than adapt the stress library for thread safety, this seems like a
good opportunity to remove it, so this commit does so.
ofproto-dpif: Zero-out subfacet counters when installation fails.
When deleting subfacets, subfacet_uninstall() asserts that the
subfacet's counters are zero to make sure we don't lose counters. We
have seen cases where a subfacet cannot be installed, but the counters
have values. This should not happen and indicates a bug, since we
shouldn't create a subfacet if the datapath has a flow. A buggy
datapath could trigger this, so just zero out the counters and log an
error.