Flavio Leitner [Tue, 14 Jan 2014 02:22:07 +0000 (00:22 -0200)]
rhel: Enable DHCP support for internal ports.
The current initscripts ifup-ovs script brings up internal ports as
ordinary Ethernet devices, so BOOTPROTO=dhcp|none does not
consider any OVS/bridge detail.
Since DHCP requires a port in the bridge to reach the server,
bring up the required port first, in the same way as is already done
for an OVS bridge.
Ben Pfaff [Fri, 10 Jan 2014 23:17:43 +0000 (15:17 -0800)]
ofproto-dpif: Un-wildcard nw_frag only for protocols that have fragments.
The revalidator code in ofproto-dpif-upcall.c, in revalidate_ukey(),
deletes any datapath flow for which the kernel reports wildcarded bits
that userspace requires to be matched. Until now, a couple of pieces of
code in ofproto-dpif always marked nw_frag (which tracks whether a packet
is an IPv4 or IPv6 fragment) as exact-match. For non-IP protocols, this
wasn't meaningful, so adding such a flow to the datapath and then receiving
it back caused nw_frag to become wildcarded, so revalidate_ukey() always
deleted them.
This fixes the problem by only un-wildcarding nw_frag for protocols where
it is defined (IPv4 and IPv6).
Noticed while observing CFM traffic (which isn't IP based) over a tunnel.
Reported-by: Guolin Yang <gyang@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
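A minimal sketch of the rule this commit describes, written in Python for brevity rather than in the C that ofproto-dpif uses. The helper names (required_fields, must_delete) and the set-of-field-names model are assumptions made for illustration; only the EtherType values are standard.

```python
ETH_TYPE_IP = 0x0800    # IPv4 EtherType
ETH_TYPE_IPV6 = 0x86DD  # IPv6 EtherType
ETH_TYPE_CFM = 0x8902   # IEEE 802.1ag CFM EtherType (not IP based)


def required_fields(dl_type):
    """Fields userspace asks the datapath to match exactly (hypothetical)."""
    fields = {"dl_type"}
    # nw_frag is only defined for IPv4 and IPv6, so only require it there.
    if dl_type in (ETH_TYPE_IP, ETH_TYPE_IPV6):
        fields.add("nw_frag")
    return fields


def must_delete(required, kernel_exact):
    """Model of the revalidator check: delete the flow when the kernel
    reports as wildcarded a field that userspace requires."""
    return not required <= kernel_exact
```

With this rule a CFM flow no longer requires nw_frag, so the flow that the kernel reports back still satisfies the check.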
Ben Pfaff [Fri, 10 Jan 2014 23:14:27 +0000 (15:14 -0800)]
tunnel: Un-wildcard only flags that really exist in tnl_xlate_init().
The revalidator code in ofproto-dpif-upcall.c, in revalidate_ukey(),
deletes any datapath flow for which the kernel reports wildcarded bits
that userspace requires to be matched. Until now, tnl_xlate_init() marked
every bit in the tunnel flags as required to be matched. Since most of
those bits don't actually have defined flags, adding such a flow to the
datapath and then receiving it back caused those bits to become wildcarded,
which meant that revalidate_ukey() always deleted them.
This fixes the problem by only un-wildcarding defined flags.
Reported-by: Guolin Yang <gyang@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
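The same idea can be sketched for tunnel flags. This is an illustrative Python model, not the code in tnl_xlate_init(): the flag names and bit positions, the 16-bit width, and the kernel_roundtrip() helper are all assumptions.

```python
# Assumed bit assignments for the tunnel flags that are actually defined.
FLOW_TNL_F_DONT_FRAGMENT = 1 << 0
FLOW_TNL_F_CSUM = 1 << 1
FLOW_TNL_F_KEY = 1 << 2
DEFINED_TNL_FLAGS = (FLOW_TNL_F_DONT_FRAGMENT | FLOW_TNL_F_CSUM
                     | FLOW_TNL_F_KEY)

OLD_REQUIRED_MASK = 0xFFFF             # every bit marked as required
NEW_REQUIRED_MASK = DEFINED_TNL_FLAGS  # only defined flags are required


def kernel_roundtrip(mask):
    """Model of the datapath reporting undefined bits back as wildcarded."""
    return mask & DEFINED_TNL_FLAGS


def survives_revalidation(required_mask):
    """True if no required bit comes back wildcarded from the kernel."""
    returned = kernel_roundtrip(required_mask)
    return required_mask & ~returned == 0
```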
handle_upcalls() always installed a flow for every packet, as long as
the datapath didn't already have too many flows, but there are cases where
we don't want to do this:
- If we get multiple packets in a single microflow all in one batch
(perhaps due to GSO breaking up a large TCP packet for sending to
userspace, or for another reason), then we only need to install the
datapath flow once.
- For a slow-pathed flow received via a slow-path action in the kernel,
we know that the kernel flow is already there (because otherwise it
would have been received as "no match" instead of an action), so
there is no benefit to reinstalling it.
Noticed because a CFM slow-pathed flow was getting reinstalled every time
a CFM packet was received.
Reported-by: Guolin Yang <gyang@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
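The two cases above can be sketched as follows. This is a simplified Python model, not the handle_upcalls() code: the (flow_key, kind) tuple representation and the "miss"/"action" labels are assumptions chosen for illustration.

```python
def flows_to_install(upcalls):
    """Return the flow keys worth installing for one batch of upcalls.

    'upcalls' is a list of (flow_key, kind) pairs, where kind is "miss"
    for a packet with no datapath match and "action" for a packet sent
    up by a slow-path action (hypothetical representation).
    """
    seen = set()
    installs = []
    for key, kind in upcalls:
        if kind == "action":
            # The kernel flow must already exist, so skip reinstalling it.
            continue
        if key in seen:
            # Same microflow seen earlier in this batch: install only once.
            continue
        seen.add(key)
        installs.append(key)
    return installs
```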
Ben Pfaff [Tue, 31 Dec 2013 22:23:34 +0000 (14:23 -0800)]
Update build requirements.
Libtool is now required as of commit 38b7a52b61 (openvswitch: Use libtool
and allow building shared libs).
It seems that a build requirement for Python slipped in a while back, for
generating ovs-vswitchd.conf.db.5, and no one complained, so we might as
well make it official. (That will let us simplify some bits of the build,
too, since they won't have to be conditional on Python anymore, so I'm all
in favor of this change.)
Ben Pfaff [Tue, 31 Dec 2013 22:20:01 +0000 (14:20 -0800)]
configure: Make autoconf fail if libtool is not installed.
Otherwise users get an error later like:
./configure: line 5093: syntax error near unexpected token `disable-shared'
./configure: line 5093: `LT_INIT(disable-shared)'
It's probably friendlier to make configuration fail earlier.
Flavio Leitner [Thu, 9 Jan 2014 03:04:33 +0000 (01:04 -0200)]
fedora package: fix systemd ordering and deps.
There is a chicken and egg issue where common OVS
configuration uses a controller which is only accessible
via the network. So starting OVS before network.target
would break those configurations.
However, the network doesn't come up after boot because
OVS isn't started until after the network scripts try
to configure OVS.
This is partially fixed by commits:
commit: 602453000e28ec1076c0482ce13c284765a84409
rhel: Automatically start openvswitch service before bringing an ovs interfa
But there is still a dependency issue between network.target
and openvswitch, which this patch fixes. It provides two systemd
service units: one that can run at any time
(openvswitch-nonetwork.service), which runs 'ovs-ctl start', and
another (openvswitch.service) that runs after network.target and
works as a frontend for the admin.
The openvswitch-nonetwork.service is used internally by the
'ifup-ovs/ifdown-ovs' scripts when adding or removing ports to
the bridge or when the openvswitch.service is enabled by the admin.
Alex Wang [Thu, 9 Jan 2014 02:51:43 +0000 (18:51 -0800)]
bfd: Fix cpath_down set failure.
Commit ccc09689 (bfd: Implement Bidirectional Forwarding Detection.)
set the bfd local diagnostic to "Concatenated Path Down" when
cpath_down was set, but only if the current local diagnostic was
"None". However, since the bfd local diagnostic is not reset when
the bfd state recovers from its last erroneous state, setting
cpath_down does not update the local diagnostic in that case.
This commit fixes the bug by always checking for a local diagnostic
change when cpath_down is set or reset.
Bug #22625 Signed-off-by: Alex Wang <alexw@nicira.com> Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Ben Pfaff [Wed, 8 Jan 2014 22:37:13 +0000 (14:37 -0800)]
dpif-netdev: Break actions out into new struct dp_netdev_actions.
This is analogous to the split between rule and rule_actions in
ofproto. As there, it will allow retaining a reference to a rule's
actions, while processing them, without having to retain a reference
to the rule itself.
Ben Pfaff [Sat, 28 Dec 2013 03:39:24 +0000 (19:39 -0800)]
ovs-atomic: Introduce a new 'struct ovs_refcount'.
This is a thin wrapper around an atomic_uint. It is useful anyhow because
each ovs_refcount_ref() or ovs_refcount_unref() call saves a few lines of
code.
This commit also changes all the potential direct users over to use the new
data structure.
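A rough Python analogue of such a wrapper, for readers unfamiliar with the pattern. It is a sketch only: a lock stands in for the atomic_uint, the class and method names are hypothetical, and returning the pre-decrement count from unref() is an assumption about the interface.

```python
import threading


class RefCount:
    """Thin reference-count wrapper (illustrative, lock-based)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count = 1  # the creator holds the first reference

    def ref(self):
        """Take an additional reference."""
        with self._lock:
            self._count += 1

    def unref(self):
        """Drop a reference and return the count before the decrement.

        A return value of 1 means the caller dropped the last reference
        and is responsible for freeing the object.
        """
        with self._lock:
            old = self._count
            self._count -= 1
            return old
```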
Ben Pfaff [Thu, 9 Jan 2014 01:13:28 +0000 (17:13 -0800)]
ovs-atomic: New functions atomic_flag_init(), atomic_flag_destroy().
Standard C11 doesn't need these functions because it is able to require
implementations not to need them. But we can't construct a portable
implementation that does not need them in every case, so this commit adds
them.
These functions are only needed for atomic_flag objects that are
dynamically allocated (because statically allocated objects can use
ATOMIC_FLAG_INIT). So far there aren't any of those, but an upcoming
commit will introduce one.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Ben Pfaff [Mon, 23 Dec 2013 23:46:22 +0000 (15:46 -0800)]
dpif-netdev: Remove max_mtu tracking.
Normally all the ports have the same mtu anyhow, so there is little
advantage in keeping track of the maximum mtu on a per-bridge basis. In
upcoming commits, tracking mtu will require more locking and present
even less advantage (because the packet buffer will become per-thread, so
that reallocating once per thread becomes essentially a null cost).
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Ben Pfaff [Wed, 25 Dec 2013 00:08:57 +0000 (16:08 -0800)]
dpif-netdev: Use hmap instead of list+array for tracking ports.
The goal is to make it easy to divide the ports into groups for handling
by threads. It seems easy enough to do that by hash value, and a little
harder otherwise.
This commit has the side effect of raising the maximum number of ports from
256 to UINT32_MAX-1. That is why some tests need to be updated:
previously, internally generated port names like "ovs_vxlan_4341" were
ignored because 4341 is bigger than the previous limit of 256.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
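Dividing ports into per-thread groups by hash value might look like the sketch below. It is written in Python purely for illustration; hash_port() is a stand-in multiplicative hash and not the hash function OVS uses, and assign_ports() is a hypothetical helper.

```python
def hash_port(port_no):
    """Stand-in 32-bit multiplicative hash (an assumption, not OVS's)."""
    return (port_no * 2654435761) & 0xFFFFFFFF


def assign_ports(port_numbers, n_threads):
    """Split ports into n_threads groups keyed by hash value."""
    groups = [[] for _ in range(n_threads)]
    for port_no in port_numbers:
        groups[hash_port(port_no) % n_threads].append(port_no)
    return groups
```

Because the grouping depends only on the port number, a large port number such as 4341 is handled the same way as a small one.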
Ben Pfaff [Mon, 23 Dec 2013 22:04:13 +0000 (14:04 -0800)]
dpif-netdev: Use new "ovsthread_counter" to track dp statistics.
ovsthread_counter is an abstract interface that could be implemented
in different ways. The initial implementation is simple but less than
optimally efficient.
Ben Pfaff [Thu, 26 Dec 2013 06:27:25 +0000 (22:27 -0800)]
netdev-dummy: Make netdev_rx_wait() wakeups work cross-thread for dummies.
Until now, netdev_dummy_rx_wait() has only checked whether the receive
queue for the dummy device is currently empty. This has worked OK because
in practice packets were queued to dummy devices only from the same thread
that attempted to receive them. An upcoming commit will use different
threads for these purposes, so this commit switches to a notification
method that works cross-thread.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Alex Wang [Thu, 9 Jan 2014 00:52:12 +0000 (16:52 -0800)]
ofproto/trace: Use final flow to compute "Relevant fields".
Commit bcd2633a (ofproto-dpif: Store relevant fields for wildcarding
in facet.) introduced a bug that uses the original input flow instead
of the final flow to compute the "Relevant fields" in ofproto/trace
output. This commit fixes this bug.
Signed-off-by: Alex Wang <alexw@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Andy Zhou [Tue, 7 Jan 2014 08:17:25 +0000 (00:17 -0800)]
ofproto-dpif: Fix a vlan-splinter megaflow bug
When vlan-splinter is enabled, OVS receives non-VLAN flows from the
kernel VLAN ports. A VLAN tag is then added to each incoming flow before
xlating, so that the flows look like those received from a trunk port.
When megaflows are enabled, xlating may set VLAN masks during rule
processing as usual. If those VLAN masks are serialized and downloaded
to the kernel (this bug), the megaflows are rejected due to
unexpected VLAN mask encapsulation, since the original kernel flows do
not have VLAN tags. This bug does not break connectivity, but it hurts
performance, since all traffic received on vlan-splinter ports is then
handled by vswitchd, as no datapath flows can be successfully
installed.
The fix is to make sure that no VLAN mask encapsulation is generated for
the datapath flow if its in_port was rewritten by the vlan-splinter
receiving logic.
Simon Horman [Tue, 7 Jan 2014 04:48:08 +0000 (13:48 +0900)]
ofproto-dpif-xlate: Correct check for MPLS LSE
Zero is a valid MPLS LSE, so it is not valid to check against
that value for MPLS LSE presence. Instead, check against
the flow's dl_type, which should be an MPLS type if an LSE is present.
This problem appears to have been introduced by b2dd70be133bf86c ("Native Set-Field action.").
Cc: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Pfaff <blp@nicira.com>
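The difference between the two checks can be sketched in a few lines of Python. The function names are hypothetical; the EtherType values 0x8847 (MPLS unicast) and 0x8848 (MPLS multicast) are the standard ones.

```python
ETH_TYPE_MPLS = 0x8847        # MPLS unicast EtherType
ETH_TYPE_MPLS_MCAST = 0x8848  # MPLS multicast EtherType
ETH_TYPE_IP = 0x0800          # IPv4 EtherType, for comparison


def has_lse_by_value(mpls_lse):
    """Old style of check: wrong, because an LSE of zero is valid."""
    return mpls_lse != 0


def has_lse_by_dl_type(dl_type):
    """Check described in this commit: an LSE is present for MPLS types."""
    return dl_type in (ETH_TYPE_MPLS, ETH_TYPE_MPLS_MCAST)
```

For an MPLS packet whose LSE happens to be zero, has_lse_by_value(0) reports no LSE while has_lse_by_dl_type(ETH_TYPE_MPLS) reports one.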
Ben Pfaff [Tue, 31 Dec 2013 19:32:16 +0000 (11:32 -0800)]
odp-util: Avoid null dereference in parse_8021q_onward().
When parsing a mask, the code in parse_8021q_onward() always read out
the OVS_KEY_ATTR_VLAN attribute without first checking whether it existed.
The correct behavior, implemented by this commit, appears to be to treat
the VLAN as wildcarded and continue parsing the flow.
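A minimal Python sketch of the corrected behavior, with a dict standing in for the parsed Netlink attributes. The helper name and the dict representation are assumptions; only the attribute name OVS_KEY_ATTR_VLAN comes from the commit message.

```python
def vlan_tci_mask(mask_attrs):
    """Return the VLAN TCI mask, or 0 (fully wildcarded) when absent.

    'mask_attrs' maps attribute names to values (hypothetical model).
    """
    tci = mask_attrs.get("OVS_KEY_ATTR_VLAN")
    if tci is None:
        # Attribute missing: treat the VLAN as wildcarded rather than
        # dereferencing a nonexistent attribute.
        return 0
    return tci
```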
Jarno Rajahalme [Mon, 30 Dec 2013 23:42:36 +0000 (15:42 -0800)]
tests: Make some tests more robust.
These tests break if the OVS internal hash function is changed. In some
cases this is due to a dependency on the order in which elements are
iterated from hash maps; in others, the algorithm used depends on the
specific hash values produced for specific inputs (groups). These
changes make the test cases more robust, so that they will not break
so easily when the OVS internal hash function implementation changes.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Alex Wang [Mon, 30 Dec 2013 23:15:52 +0000 (15:15 -0800)]
bfd: Make bfd decay test robust.
With the OVS multithreading implementation, the bfd decay test
becomes fragile due to its heavy dependency on timing.
This commit removes these dependencies and makes the test robust.
Signed-off-by: Alex Wang <alexw@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Wed, 4 Dec 2013 23:46:55 +0000 (15:46 -0800)]
netdev-linux: Simplify get_stats_via_netlink().
There's no need to obtain the ifindex, because RTM_GETLINK is happy to take
the interface name. There's no need to do a full nl_policy_parse(),
because we only need a single attribute.
Ben Pfaff [Wed, 4 Dec 2013 23:43:31 +0000 (15:43 -0800)]
netdev-linux: Drop support for pre-2.6.19 kernels.
The OVS kernel module requires 2.6.32 or later, so there's no reason for
userspace to support older kernels. This commit removes the special
fallback code for retrieving Linux netdev stats in pre-2.6.19 kernels,
which should no longer be useful.
Alex Wang [Fri, 20 Dec 2013 23:12:58 +0000 (15:12 -0800)]
ofproto-dpif-monitor: Remove monitor_init().
Commit 881d47a9fa9 (monitor: Replace monitor_seq with periodic
wakeup.) removed the global "struct seq" in the ofproto-dpif-monitor
module, so monitor_init() is no longer needed.
This commit removes monitor_init() from ofproto-dpif-monitor.c.
Signed-off-by: Alex Wang <alexw@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Mon, 30 Dec 2013 19:35:41 +0000 (11:35 -0800)]
tests: Remove \r from source tree.
An ovsdb-server test had a literal carriage return in a check that
validates a directory name. It isn't really necessary (who puts a carriage
return in a directory name?) and it does cause problems for passing around
patches via email, so just delete it.
CC: Arun Sharma <arun.sharma@calsoftinc.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Tue, 24 Dec 2013 16:37:32 +0000 (08:37 -0800)]
Makefile.am: Always use C locale for "sort" and "comm".
Otherwise, if the user changes locales between running the "dist-hook-git"
and "distfiles" targets (e.g. in different invocations of "make"), then
the "dist-hook-git" target might falsely report that the distribution is
missing files.
Reported-by: John Darrington <john@darrington.wattle.id.au> Signed-off-by: Ben Pfaff <blp@nicira.com>
Joe Stringer [Wed, 25 Dec 2013 00:50:53 +0000 (16:50 -0800)]
bfd: Notify connectivity_seq on rmt_state changes.
The bfd module did not previously change the global connectivity_seq
when the remote state changed, which means that such state changes may
not be propagated to the database. This is particularly bad if this is
the last state transition to happen in an otherwise stable environment.
This patch checks for transitions in remote state, and ensures that the
main thread will update the database when these happen.
Bug #22136.
Co-authored-by: Alex Wang <alexw@nicira.com> Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
ofproto-dpif: Verbosity option for dpif/dump-flows command.
The display of port names instead of port number for in_port
is considered useful. Enabling the verbosity option also lets
you see all the wildcarded fields and can be helpful.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
YAMAMOTO Takashi [Tue, 24 Dec 2013 01:04:08 +0000 (10:04 +0900)]
ofproto: Avoid leaving a broken def
On errors, don't leave a broken ipfix-entries.def, which might cause
mysterious errors later.
(Probably the most common cause is the lack of python xml libraries.)
Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Signed-off-by: Ben Pfaff <blp@nicira.com>
Chris Luke [Sun, 22 Dec 2013 22:43:33 +0000 (14:43 -0800)]
datapath: bug.h missing from distfiles
Commit 7c359202 introduced datapath/linux/compat/include/linux/bug.h
but did not include it in datapath/linux/Modules.mk, which results
in the following build error:
> The distribution is missing the following files:
> datapath/linux/compat/include/linux/bug.h
Signed-off-by: Chris Luke <chris_luke@cable.comcast.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
Andy Zhou [Sat, 21 Dec 2013 00:18:58 +0000 (16:18 -0800)]
datapath: Fix sparse warning on BUILD_BUG_ON_INVALID()
Sparse gives the following warnings when compiling against Linux kernel
3.5:
CHECK /root/projs/ovs/openvswitch/datapath/linux/skbuff-openvswitch.c
include/linux/mm.h:405:9: error: undefined identifier
'BUILD_BUG_ON_INVALID'
include/linux/mm.h:405:9: error: not a function <noident>
The same issue may also exist in kernel 3.6.
Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Alex Wang [Fri, 20 Dec 2013 22:53:52 +0000 (14:53 -0800)]
bfd: Send FINAL immediately after receiving POLL.
Commit 307464a11 (ofproto-dpif-monitor: Use heap to order the mport
wakeup time.) makes bfd send packets only at specified periodic
instants. This fails to meet the RFC 5880 requirement that bfd send
FINAL immediately after receiving POLL.
This commit fixes the above issue by scheduling bfd to send FINAL
within 100 ms after receiving POLL.
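One way to read "within 100 ms" is as a cap on the next transmission time. The Python sketch below illustrates that reading only; the function name, its parameters, and the use of min() are assumptions, not the logic in the bfd module.

```python
FINAL_DEADLINE_MS = 100  # bound taken from the commit message


def next_tx_time(now_ms, scheduled_ms, poll_received):
    """Return when the next bfd packet should go out (hypothetical).

    If a POLL was just received, pull the wakeup forward so that the
    FINAL is sent no later than 100 ms from now.
    """
    if poll_received:
        return min(scheduled_ms, now_ms + FINAL_DEADLINE_MS)
    return scheduled_ms
```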
Thomas Graf [Thu, 19 Dec 2013 15:20:42 +0000 (16:20 +0100)]
linux: Report supported user features to the kernel
Following commit ("netlink: Do not enforce alignment of last Netlink
attribute"), signal the ability to receive unaligned Netlink messages
to the datapath to enable utilization of zerocopy optimizations.
Opening a datapath is now done by issuing an OVS_DP_CMD_SET in order
to overwrite previously set user features.
Signed-off-by: Thomas Graf <tgraf@redhat.com> Acked-by: Ben Pfaff <blp@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
ovs-check-dead-ifs: Flush buffer before calling execvp.
According to the Python documentation for execvp:
http://docs.python.org/2/library/os.html
"The current process is replaced immediately. Open file objects
and descriptors are not flushed, so if there may be data buffered
on these open files, you should flush them using sys.stdout.flush()
or os.fsync() before calling an exec* function."
Without the flush, the output of print statements issued before the
exec is lost when output is redirected to a file.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
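The pattern looks roughly like the sketch below. The wrapper name flush_and_exec() and its injectable execvp parameter are hypothetical, added so that the function can be exercised without replacing the process; sys.stdout.flush() and os.execvp() are the standard library calls named in the quoted documentation.

```python
import os
import sys


def flush_and_exec(argv, execvp=os.execvp):
    """Flush buffered output, then replace the current process.

    'execvp' defaults to os.execvp; passing a stand-in is a testing
    convenience and not part of the original script.
    """
    # Anything still buffered here would be lost once exec replaces us.
    sys.stdout.flush()
    sys.stderr.flush()
    execvp(argv[0], argv)
```

A typical call would be flush_and_exec(["ip", "link"]) placed after the last print statement.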
Ben Pfaff [Fri, 20 Dec 2013 16:39:27 +0000 (08:39 -0800)]
ofp-print: Print durations with at least three decimals.
Occasionally I run a command like this:
watch -n.1 ovs-ofctl dump-flows br0
to see how flows change over time. Until now, it has been more difficult
than necessary to spot real changes, because flows "jump around" as the
number of decimals printed for duration changes from moment to moment.
That is, you might see
cookie=0x0, duration=4.566s, table=0, n_packets=0, ...
one moment, and then
cookie=0x0, duration=4.8s, table=0, n_packets=0, ...
the next moment. Shortening 4.800 to 4.8 shifts everything following it
two places to the left, creating a visual jump.
This commit avoids that problem by always printing at least three decimals
if we print any. There can still be an occasional jump if a duration is
exactly on a second boundary, but that only happens 1/1000 of the time.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
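The formatting rule can be sketched in Python as follows. This illustrates the behavior described above and is not the ofp-print code; the function name and the (sec, nsec) parameters are assumptions.

```python
def format_duration(sec, nsec):
    """Format a duration, keeping at least three decimals if any."""
    if nsec == 0:
        # Exactly on a second boundary: print no decimals at all.
        return "%ds" % sec
    digits = "%09d" % nsec
    # Trim trailing zeros, but never below three decimal places.
    while len(digits) > 3 and digits.endswith("0"):
        digits = digits[:-1]
    return "%d.%ss" % (sec, digits)
```

With this rule, 4.8 seconds prints as "4.800s", so it lines up with "4.566s" in the previous refresh.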
Jarno Rajahalme [Fri, 20 Dec 2013 16:16:31 +0000 (08:16 -0800)]
lib/flow: Skip minimask value checks.
We allow zero 'values' in a miniflow for it to have the same map
as the corresponding minimask. Minimasks themselves never have
zero data values, though. Document this and optimize the code
accordingly.
v2:
- Made miniflow_get_map_in_range() to return data offset instead of
a pointer via the last parameter.
- Simplified minimatch_hash_in_range() by removing pointer arithmetic.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
YAMAMOTO Takashi [Fri, 20 Dec 2013 10:31:06 +0000 (19:31 +0900)]
tests/learn.at: Workaround a race
This test seems to assume that the switch completes
processing of the first packet before it starts processing
the second one. I don't see any code ensuring that.
Work around the problem by allowing 1 second for the upcall.
Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Signed-off-by: Ben Pfaff <blp@nicira.com>
YAMAMOTO Takashi [Fri, 20 Dec 2013 10:31:05 +0000 (19:31 +0900)]
timeval: Workaround for threaded test failures
BFD tests contain code like the following:
# wait for a while to stablize everything.
for i in `seq 0 9`; do ovs-appctl time/warp 500; done
They no longer work as intended because the BFD code now runs in a
separate monitor thread. The loop merely "warps" the time by 5000 ms.
The monitor thread should have been woken at least once, but that is
far from "wait for a while to stablize everything."
This commit mitigates the problem by sleeping a little in the
appctl handler. This is not ideal, but it makes the BFD tests succeed
in my environment.
Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Signed-off-by: Ben Pfaff <blp@nicira.com>
YAMAMOTO Takashi [Fri, 20 Dec 2013 10:31:04 +0000 (19:31 +0900)]
tests/ofproto-dpif.at: Workaround a race
This test seems to assume that only the first packets in flows
are counted as 'miss'. I don't see any code ensuring that.
The test would fail if the upcall handler for the flow doesn't
run fast enough. Work around the problem by allowing 1 second
for the miss upcall.
Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ethan Jackson [Tue, 24 Sep 2013 20:39:56 +0000 (13:39 -0700)]
ofproto: Handle flow installation and eviction in upcall.
This patch moves flow installation and eviction from ofproto-dpif and
the main thread, into ofproto-dpif-upcall. This performs
significantly better (approximately 2x TCP_CRR improvement), and
allows ovs-vswitchd to maintain significantly larger datapath flow
tables. On top of that, it significantly simplifies the code,
retiring "struct facet" and friends.
Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Joe Stringer [Wed, 20 Nov 2013 22:25:43 +0000 (14:25 -0800)]
unixctl: Make dpif/dump-flows fetch kernel flows.
Previously we used facets for ovs-appctl dpif/dump-flows commands.
This switches to fetching flows directly from the dpif. This is
necessary because future patches remove facets and subfacets entirely.
Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Pravin B Shelar [Wed, 18 Dec 2013 18:57:33 +0000 (10:57 -0800)]
datapath: Fix build failure on RHEL 6.4
This patch fixes the following build failure:
make[4]: Entering directory
`/usr/src/kernels/2.6.32-358.18.1.el6.x86_64'
CC [M] openvswitch/datapath/linux/actions.o
In file included from
openvswitch/datapath/linux/actions.c:21:
openvswitch/datapath/linux/compat/include/linux/skbuff.h:273:
error: redefinition of ‘__skb_fill_page_desc’
include/linux/skbuff.h:1123: note: previous definition of
‘__skb_fill_page_desc’ was here
-----
Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
Ben Pfaff [Wed, 18 Dec 2013 21:47:16 +0000 (13:47 -0800)]
ofp-tcpdump: Fix tcpdump patch breakage due to libtool.
The recently introduced use of libtool, in commit 38b7a52b618b98
(openvswitch: Use libtool and allow building shared libs) broke the
tcpdump patch. This fixes the problem.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Justin Pettit <jpettit@nicira.com>