netdev-linux: avoid negative value cast to non-negative in comparison
I am using a userspace OVS, there is a constant ERR message in log:
"dpif_netdev|ERR|error receiving data from ovs-netdev: Message too long"
Check the code and find there is a comparison between ssize_t and size_t type in
netdev_rx_linux_recv(). It makes the negative "retval" greater than the "size".
Change the sequence of comparison to avoid negative return value cast to
a positive and become greater then a non-negative value.
Signed-off-by: ZhengLingyun <konghuarukhr@163.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
ofproto-dpif: Make "ovs-appctl dpif/dump-flows" megaflow-aware.
The "ovs-appctl dpif/dump-flows" command wasn't updated to print
megaflows, so the output would not include wildcards even though the
datapath may, so the output was inconsistent and confusing.
ofproto-dpif: Expire fin_timeout actions when no previous timeout set.
Commit e503cc199 (ofproto: Optimise OpenFlow flow expiry) optimized
OpenFlow flow expiration by putting expirable flows on a list. However,
the list is only configured at rule creation time. If the rule is
created without a timeout, but is later set by the fin_timeout action,
it will never expire. This commit adds the rule to the list when the
action is triggered.
Signed-off-by: Justin Pettit <jpettit@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
ofproto-dpif: Don't put new subfacets as result of "userspace" action.
Don't install a flow if it's the result of the "userspace" action for an
already installed facet. This can occur when a datapath flow with
wildcards has a "userspace" action and flows sent to userspace result in
a different subfacet, which will then be rejected as overlapping by the
datapath.
Signed-off-by: Justin Pettit <jpettit@nicira.com> Co-authored-by: Ethan Jackson <ethan@nicira.com>
Ben Pfaff [Tue, 16 Jul 2013 17:43:07 +0000 (10:43 -0700)]
poll-loop: Simplify and speed up polling.
The simplification comes from dropping support for canceling a
poll_waiter, which was a feature that was never used. The speedup
comes from avoiding a malloc() for every call to poll_fd_wait().
(I doubt that this significantly improves performance.)
This prepares for making the polling structures per-thread in
the next commit.
datapath: Use masked flow when validating actions.
It is important to validate flow actions to ensure that they do
not try to write off the end of a packet. The mechanism to do this
is to ensure that a flow is precise enough to describe valid vs.
invalid packets and only allowing actions on valid flows.
The introduction of megaflows broke this by using a narrow base
flow but a potentially wide match. This meant that while the
original flow was properly validated, later packets might not
conform to that flow and could be truncated. This switches to
using the masked flow instead, effectively requiring that all
possible matching packets be valid in order for a flow's actions
to be accepted.
This change only affects the flow setup path - executed packets
have always used the flow extracted from the packet and therefore
were properly validated.
Ben Pfaff [Fri, 12 Jul 2013 16:39:15 +0000 (09:39 -0700)]
tests: Fix theoretical races in "ofproto-dpif - controller" test.
I don't see anything that guaranteed that ovs-ofctl would receive and print
the packets before it exited. This commit inserts explicit waits, to avoid
the problem.
hindex_next() make the completely wrong assumption that head nodes within
a bucket were sorted in ascending order by hash. This commit removes
that assumption.
Also add a test that would have found the problem.
Signed-off-by: ZhengLingyun <konghuarukhr@163.com>
[blp@nicira.com changed how hindex_head_node() is implemented and
other code details] Signed-off-by: Ben Pfaff <blp@nicira.com>
datapath: Revert "datapath: rhel: Account for RHEL specific backports"
This reverts commit 752378e1cd1f133a8366fbacec3b281a45ff8268
(datapath: rhel: Account for RHEL specific backports).
Change related to netif_needs_gso() is cuasing panic on RHEL
and Xen platforms. This way we revert back to use of ovs
skb_gso_segment() and netif_skb_features() which has required
compat code for gso for kernel older than 2.6.38.
Ben Pfaff [Wed, 10 Jul 2013 16:53:21 +0000 (09:53 -0700)]
stress: Remove essentially unused library.
The "stress" library was introduced years ago. We intended at the time to
start using it to provoke errors in testing, to make sure that Open vSwitch
was resilient against those errors. The intention was good, but there were
few actual implementations of stress options, and the testing never
materialized.
Rather than adapt the stress library for thread safety, this seems like a
good opportunity to remove it, so this commit does so.
ofproto-dpif: Zero-out subfacet counters when installation fails.
When deleting subfacets, subfacet_uninstall() asserts that the
subfacet's counters are zero to make sure we don't lose counters. We
have seen cases where a subfacet cannot be installed, but the counters
have values. This should not happen and indicates a bug, since we
shouldn't create a subfacet if the datapath has a flow. A buggy
datapath could trigger this, so just zero out the counters and log an
error.
Ben Pfaff [Fri, 12 Jul 2013 00:02:12 +0000 (17:02 -0700)]
timeval: Remove backtrace feature.
The backtrace feature of timeval is useful because it provides a "poor
man's profile" view of Open vSwitch. But it is not likely to be useful in
a multithreaded process, because signal delivery doesn't necessarily follow
the profile when there is more than one thread. (A signal in a
multithreaded process are delivered to an arbitrary thread.)
Another problem with the backtrace feature is that it is difficult for
format_backtraces() to synchronize properly with the signal handler in a
multithreaded process. In a single-threaded process, it can just block
the signal handler, but in a multithreaded process this does not prevent
signal delivery to threads other than the one running format_backtrace().
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ed Maste <emaste@freebsd.org>
ovs-ctl.in: Increase the limit on file descriptors.
Testing shows that creation of 5000 internal ports and using it
to do some meaningful tasks works fine on a 12 cpu hardware.
Since a single port needs one file descriptor and a bridge
needs 3 file descriptors, we will have to increase the file
descriptor limit to a higher number from the current limit of 5000.
7500 feels like a decent increase with enough room for further
scale testing.
Bug #18383. Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Tue, 9 Jul 2013 16:23:02 +0000 (09:23 -0700)]
dpif-netdev: Make "packet-out" with in_port=OFPP_CONTROLLER work again.
Commit 4e022ec09e14 (Create specific types for ofp and odp port) broke
OpenFlow OFPP_PACKET_OUT requests that use in_port=OFPP_CONTROLLER. This
commit fixes the problem and adds a regression test.
CC: Alex Wang <alexw@nicira.com> Reported-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Signed-off-by: Ben Pfaff <blp@nicira.com>
There is no default value for the tunnel TTL, so it must be
specified when setting up a new flow. However, the flow rejection
log message indicates that the TTL must be non-zero, which is not
true.
CC: Andy Zhou <azhou@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
datapath: Always include tunnel TTL in serialized Netlink attributes.
There is no default value for the tunnel TTL so it must always be
included in flow keys sent from userspace to kernel. The kernel
should also respect this convertion when sending flows to userspace
by always including the TTL in tunnel flows.
CC: Andy Zhou <azhou@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
datapath: Use DP_MAX_PORTS when no IN_PORT attribute is present.
To indicate that a flow is not associated with any particular in port,
userspace may omit the IN_PORT attribute, which the kernel translates
internally to the special value DP_MAX_PORTS. After the megaflows
changes, this was no longer being done, resulting in it using port 0
(the internal port).
This also adopts a wildcarding scheme similar to 802.2 packets where
a mask can be specified for this non-existent key attribute but it
must either be completely wildcarded or completely exact match.
CC: Andy Zhou <azhou@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
Ben Pfaff [Mon, 8 Jul 2013 17:47:37 +0000 (10:47 -0700)]
ovs-vsctl: Fix behavioral regression for "--if-exists del-port <bridge>".
Commit 89f3c258fe (ovs-vsctl: Improve error message for "ovs-vsctl del-port
<bridge>".) changed the behavior of
ovs-vsctl --if-exists del-port <bridge>
from a silent no-op to a hard failure. This commit fixes this regression.
This caused problems on XenServer, for which the Open vSwitch integration
runs commands like:
/usr/bin/ovs-vsctl --timeout=20 \
-- --with-iface --if-exists del-port xapi103 \
-- --if-exists del-br xapi103
Bug #18276. Reported-by: Michael Hu <mhu@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ethan Jackson [Sun, 7 Jul 2013 01:55:45 +0000 (18:55 -0700)]
utilities: New helper script ovs-dev.py.
ovs-dev.py is a script I've written to help perform common tasks
necessary for developing Open vSwitch. It allows a developer to
configure, build, and run the switch with a minimum of effort or
knowledge of the various idiosyncrasies involved.
Thomas Graf [Tue, 9 Jul 2013 18:36:57 +0000 (20:36 +0200)]
datapath: rhel: Move RHEL OVS hook registration to netdev_rx_handler_register() backport
Moves the registration of the RHEL specific OVS hook to the compat
backport of netdev_rx_handler_register(). This moves the hook
unregistration from the RCU callback to the netdev_destroy()
callback directly.
This is purely cosmetic though, the RHEL hook is only used if the
IFF_OVS_DATAPATH flag is present which was removed under RTNL
protection before the RCU callback.
Signed-off-by: Thomas Graf <tgraf@redhat.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
Ethan Jackson [Thu, 2 May 2013 22:22:19 +0000 (15:22 -0700)]
ofproto-dpif: Remove pointers between rules and facets.
Before this patch, facets maintained a pointer to the first rule
used when translating their actions, and rules maintained a pointer
to those facets. This made sense before the resubmit actions which
each facet used precisely one rule. However, today a facet may
require many rules to translate, and therefore it makes no
conceptual sense to designate one as the "owning rule".
Furthermore, as Open vSwitch becomes multithreaded, maintaining a
facet's rule pointer will become more difficult. One thread will
do the action translation, while another will maintain the facets.
During the hand-off between these threads, it's possible the
"owning rule" will expire leaving us with a stale pointer.
This patch does have a disadvantage, Pushing a facet's statistics
will become slightly less efficient as it will involve an
additional classifier lookup. We can revisit this issue once
multithreading is complete, but I suspect there's much lower
hanging fruit to worry about.
Ethan Jackson [Fri, 14 Jun 2013 01:38:24 +0000 (18:38 -0700)]
ofproto-dpif: Modularize ofproto-dpif-xlate.
This patch modularizes ofproto-dpif-xlate by disentangling it from
ofproto-dpif. Instead of poking around in ofproto-dpif's internal
data structures, ofproto-dpif-xlate is updated with a simple API
which can easily be made thread safe. There are still some places
where ofproto-dpif-xlate needs to call into ofproto-dpif, but this
patch makes significant progress towards the final goal.
Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Ethan Jackson [Thu, 20 Jun 2013 20:00:27 +0000 (13:00 -0700)]
ofproto-dpif: Modularize mirror code.
This code modularizes ofproto-dpif's mirror code by moving it to
ofproto-dpif-mirror. Not only does this shorten ofproto-dpif and
hide complexity, but its also necessary for future patches which
modularize ofproto-dpif-xlate in preparation for multi-threading.
Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Ethan Jackson [Thu, 20 Jun 2013 01:51:28 +0000 (18:51 -0700)]
ofproto-dpif: Handle dest mirrors in compose_output_action().
Before this patch, the mirroring code would retroactively insert
actions for destination mirrors after actions were translated.
This relied on converting datapath output actions into ofports
which doesn't work for tunnels and patch ports. This patch
refactors the code to handle destination mirrors at output.
Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Ethan Jackson [Wed, 19 Jun 2013 21:40:21 +0000 (14:40 -0700)]
lacp: Handle unknown slaves in lacp_process_packet().
In future patches, ofproto-dpif-xlate may be temporarily out of
sync with ofproto-dpif proper, and pass an unknown ofport to
lacp_process_packet(). This patch handles that edge case
gracefully.
Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
bitmap: Fix bitmap_allocate1() bug when n_bits is a multiple of 32.
In function bitmap_allocate1(), the last "unsigned long" in bitmap was
set to all 0-bits when (n_bits % BITMAP_ULONG_BITS == 0). This commit
correctly sets it to all 1-bits.
Signed-off-by: ZhengLingyun <konghuarukhr@163.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Pavithra Ramesh [Sat, 29 Jun 2013 10:09:02 +0000 (10:09 +0000)]
BFD: Add forwarding_override command for BFD.
Added appctl commands to override the bfd forwarding status. Values
of true/false/normal are allowed. When set to normal, the bfd
forwarding status is left unchanged. Else, the forwarding status is
set to the specified value - true/false.
Signed-off-by: Pavithra Ramesh <paramesh@vmware.com> Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Ben Pfaff [Mon, 8 Jul 2013 17:15:00 +0000 (10:15 -0700)]
ofp-parse: Do not exit() upon a parse error.
Until now, failure to parse a flow in the ofp-parse module has caused the
program to abort immediately with a fatal error. This makes it hard to
use these functions from any long-lived program. This commit fixes the
problem.
Andy Zhou [Wed, 3 Jul 2013 16:18:27 +0000 (09:18 -0700)]
datapath: add netlink error message to help kernel userspace integration.
When kernel rejects a netlink message, it usually returns EINVAL
error code to the userspace. The actual reason for rejecting the
netlinke message is not available, making it harder to debug netlink
issues. This patch adds kernel log messages whenever a netlink message
is rejected with reasons. Those messages are logged at the info level.
Those messages are logged only once per message, to keep kernel log noise
level down. Reload the kernel module to re-enable already logged
messages.
The messages are meant to help developers to debug userspace and kernel
intergration issues. The actual message may change or be removed over time.
These messages are not expected to show up in a production environment.
Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
odp-util: Always encode mask of 0xffff for dl_type < ETH_TYPE_MIN.
For non-Ethernet II packets, we don't set an EtherType netlink attribute
and set the Ethertype mask attribute to 0xffff. The code was encoding
whatever mask was passed in, which could lead to bugs if the caller
didn't know the userspace-kernel interface.
Found by inspection.
Signed-off-by: Justin Pettit <jpettit@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
Andy Zhou [Tue, 2 Jul 2013 22:58:19 +0000 (15:58 -0700)]
datapath: Fix tunnel source port selection for mega flow
Tunnel source port selection was based on hash value cached in the
flow. This no longer works with mega flow, since all flows matching
a mega flow will be transmitted with the same tunnel source port.
This patch computes the tunnel source port at run time based on each
incoming packet. Packets belong to the same micro flow would still get
the same source port, but multiple micro flows hitting the same mega flow
can get different source ports.
Packets injected from the usespace will be assigned to the same
source port as if they are forwarded in the kernel.
Bug #18216
Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
> commit 10a89ef04df5669c5cdd02f786150a7ab8454e01
> Author: Ben Pfaff <blp@nicira.com>
> Date: Mon Jun 24 10:54:49 2013 -0700
>
> Replace all uses of strerror() by ovs_strerror(), for thread safety.
>
> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Sat, 29 Jun 2013 15:18:54 +0000 (08:18 -0700)]
test-atomic: Drop atomic read-modify-write tests for the moment.
XenServer builds are failing because of link errors reporting that
__sync_fetch_and_<op>_<size> were not found, for <op> in add/sub/and/xor/or
and <size> in 1/2/4/8. We're not actually using these RMW operations yet,
so as a stopgap measure just drop the tests.
The correct long-term fix is probably to do Autoconf linkage tests for
these operations instead of just testing the GCC version (if we really need
the operations at all).
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>