Ansis Atteka [Tue, 18 Feb 2014 21:19:36 +0000 (13:19 -0800)]
ovs-vsctl: reconnect to the database if connection was dropped
If ovs-vsctl has to wait for ovs-vswitchd to reconfigure itself
according to the new database, then sometimes ovs-vsctl could
end up stuck in the event loop if OVSDB connection was dropped
while ovs-vsctl was still running.
This patch fixes this problem by letting ovs-vsctl to reconnect
to the OVSDB, if it has to wait cur_cfg field to be updated.
Joe Stringer [Wed, 19 Feb 2014 18:26:31 +0000 (10:26 -0800)]
ofproto: Remove 'force-miss-model' configuration.
This configuration item was introduced to assist testing of upcall
handling behaviour with and without facets. Facets were removed in
commit e79a6c833e0d7237, so this patch removes the configuration item.
Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Simon Horman [Wed, 19 Feb 2014 12:27:16 +0000 (21:27 +0900)]
ofp-actions: Correct pop MPLS ethtype as consistency test
Correct pop MPLS ethtype consistency check to verify that
the packet has an MPLS ethtype before the pop action rather than after:
an MPLS ethtype is a pre-condition but not a post-condition of pop MPLS.
With this change the consistency check in ofpact_check__()
becomes consistent with that in ofpact_from_nxast().
This was found using Ryu tests via the new make check-ryu target.
It allows all of the "POP_MPLS"[1] tests to pass where they previously
failed.
Jarno Rajahalme [Tue, 18 Feb 2014 17:07:03 +0000 (09:07 -0800)]
datapath: Per NUMA node flow stats.
Keep kernel flow stats for each NUMA node rather than each (logical)
CPU. This avoids using the per-CPU allocator and removes most of the
kernel-side OVS locking overhead otherwise on the top of perf reports
and allows OVS to scale better with higher number of threads.
With 9 handlers and 4 revalidators netperf TCP_CRR test flow setup
rate doubles on a server with two hyper-threaded physical CPUs (16
logical cores each) compared to the current OVS master. Tested with
non-trivial flow table with a TCP port match rule forcing all new
connections with unique port numbers to OVS userspace. The IP
addresses are still wildcarded, so the kernel flows are not considered
as exact match 5-tuple flows. This type of flows can be expected to
appear in large numbers as the result of more effective wildcarding
made possible by improvements in OVS userspace flow classifier.
There is a small increase in kernel spinlock overhead due to the same
spinlock being shared between multiple cores of the same physical CPU,
but that is barely visible in the netperf TCP_CRR test performance
(maybe ~1% performance drop, hard to tell exactly due to variance in
the test results), when testing for kernel module throughput (with no
userspace activity, handful of kernel flows).
On flow setup, a single stats instance is allocated (for the NUMA node
0). As CPUs from multiple NUMA nodes start updating stats, new
NUMA-node specific stats instances are allocated. This allocation on
the packet processing code path is made to never block or look for
emergency memory pools, minimizing the allocation latency. If the
allocation fails, the existing preallocated stats instance is used.
Also, if only CPUs from one NUMA-node are updating the preallocated
stats instance, no additional stats instances are allocated. This
eliminates the need to pre-allocate stats instances that will not be
used, also relieving the stats reader from the burden of reading stats
that are never used.
Ben Pfaff [Fri, 14 Feb 2014 18:34:58 +0000 (10:34 -0800)]
tests: Add support for automatically running Ryu tests against OVS.
The Ryu controller comes with an extensive library of OpenFlow tests, but
it doesn't seem so easy to me to run all of them against a development
version of Open vSwitch. This commit introduces a Makefile target so that
one can run all the Ryu tests with a simple "make check-ryu".
This commit adds documentation for the new target to INSTALL. It also
moves the documentation for the "check-oftest" target and the
--enable-coverage configure option into INSTALL.
Signed-off-by: Ben Pfaff <blp@nicira.com> Reviewed-by: Simon Horman <horms@verge.net.au>
Jarno Rajahalme [Tue, 11 Feb 2014 23:34:39 +0000 (15:34 -0800)]
datapath: Fix race.
ovs_vport_cmd_dump() did rcu_read_lock() only after getting the
datapath, which could have been deleted in between. Resolved by
taking rcu_read_lock() before the get_dp() call.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Jarno Rajahalme [Tue, 11 Feb 2014 17:49:30 +0000 (09:49 -0800)]
ofproto: Move 'rule->used' to the provider.
Rule's 'used' timestamp is updated at the same time with the other
stats. So far the 'used' has been updated without proper protection,
which may lead to 'tearing' in 32-bit architectures, resulting in an
incorrect 'used' timestamp. While in practice this is highly
improbable, it is still better to handle this correctly.
This is resolved by moving the 'used' field to the provider's stats,
so that whatever protection is used for updating packet and byte
counts, can be also used for both reading and writing of the 'used'
timestamp.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Jarno Rajahalme [Fri, 7 Feb 2014 19:34:02 +0000 (11:34 -0800)]
ofproto: Optimize the case of a repeated learn action execution.
When the target flow exists already as intended, only the 'modified'
time needs to be updated. Allow modification without taking the
'ofproto_mutex' by always using the 'rule->mutex' when accessing the
'modified' time.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Andy Zhou [Sat, 8 Feb 2014 01:25:42 +0000 (17:25 -0800)]
netdev-dummy: Reduce reconnect back off timeout
netdev-dummy will mostly be used for testing and debugging over fairly
reliable connection. Reduce reconnect back off timeout in case the first
connect attempt failed.
Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Andy Zhou [Fri, 7 Feb 2014 22:45:14 +0000 (14:45 -0800)]
netdev-dummy: Fix reconnecting.
The netdev-dummy unit test ran into the reconnect condition on Jarno's
machine. With his test environment, we found and fixed the bugs in
handling reconnect.
Ben Pfaff [Tue, 11 Feb 2014 23:13:56 +0000 (15:13 -0800)]
tunnel: Support all combinations of flow-based and specific tunnel matches.
There are 12 possible ways to specify a tunnel (2 * 2 * 3 == 12):
- Specific in_key or flow-based (2 choices).
- Specific ip_dst or flow-based (2 choices).
- Specific ip_src, wildcarded, or flow-based (3 choices).
Until now, only 6 of the 12 possibilities have been supported. We
have had a couple of requests to add another. This commit adds all the
possibilities, so that we won't have to add the other 6 one by one.
Signed-off-by: Ben Pfaff <blp@nicira.com> Requested-by: Thomas Morin <thomas.morin@orange.com> Acked-by: pritesh <pritesh.kothari@cisco.com>
Alex Wang [Thu, 6 Feb 2014 23:46:05 +0000 (15:46 -0800)]
bond: Change the way of assigning bond slave for unassigned bond entry.
Before this commit, ovs randomly selects a slave for unassigned
bond entry. If the selected slave is not enabled, the active slave
is chosen instead. In this commit, the slave is selected from the
list of all enabled slaves in a round-robin fashion. This helps
improve the consistency of bond behavior when new flows are added.
Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Joe Stringer [Sat, 8 Feb 2014 00:39:54 +0000 (16:39 -0800)]
tests: Fix MPLS test cases.
The "userspace" MPLS test case was checking the same things as the
"drop" test case, rather than checking to see that packets were being
sent to userspace. This patch makes the testsuite consistent with itself.
Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
This commit creates events and through poll_fd_wait_event()
associates them with socket file descriptors to get woken up
from poll_block().
Some other changes:
* Windows does not have sys/fcntl.h but has a fcntl.h
On Linux, there is fctnl.h too.
* include <openssl/applink.c> to handle different C-Runtime linking
of OVS and openssl libraries as suggested at
https://www.openssl.org/support/faq.html#PROG2
The above include will not be needed if we compile Open vSwitch with
/MD compiler option.
* SHUT_RDWR is equivalent to SD_BOTH on Windows.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
poll-loop: Make poll_fd_wait_event() cross-platform.
This is helpful if we want to wait either on 'fd' for POSIX or
events for Windows.
For Windows, if both 'fd' and 'wevent' is specified, we associate
that event with the 'fd' using WSAEventSelect() in poll_block().
So any 'events' on that 'fd' will wake us up from WaitForMultipleObjects().
WSAEventSelect() does not understand POLLIN, POLLOUT etc. Instead the
macros it understands are FD_READ, FD_ACCEPT, FD_CONNECT, FD_CLOSE etc.
So we need to make that transition.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Tue, 11 Feb 2014 16:24:16 +0000 (08:24 -0800)]
ofproto-dpif-xlate: Make flows that match ICMP fields revalidate correctly.
ICMPv4 and ICMPv6 have 8-bit "type" and "code" fields. struct flow
uses the low 8 bits of the 16-bit tp_src and tp_dst members to
represent these fields. The datapath interface, on the other hand,
represents them with just 8 bits each. This means that if the high 8
bits of the masks for these fields somehow become set (meaning to
match on the nonexistent "high bits" of these fields) during
translation, then they will get chopped off by a round trip through
the datapath, and revalidation will spot that as an inconsistency and
delete the flow. This commit avoids the problem by making sure that
only the low 8 bits of either field can be unwildcarded for ICMP.
This seems like the minimal fix for this problem, appropriate for
backporting to earlier branches. The root of the issue is that these high
bits can get set in the match at all. I have some leads on that, but they
require more invasive changes elsewhere.
Bug #23320. Reported-by: Krishna Miriyala <miriyalak@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
Ben Pfaff [Tue, 4 Feb 2014 17:01:16 +0000 (09:01 -0800)]
pcap-file: Allow capturing TCP streams where the SYN is not seen.
Until now, the tcp_stream() code has ignored any TCP packets that don't
correspond to a stream that began with a TCP SYN. This commit changes the
code so that any TCP packet that contains a payload starts a new stream.
Signed-off-by: Ben Pfaff <blp@nicira.com> Reported-by: Vasu Dasari <vdasari@gmail.com>
Ben Pfaff [Tue, 4 Feb 2014 20:39:37 +0000 (12:39 -0800)]
Always insert MPLS labels after VLAN tags.
OpenFlow 1.1 and 1.2 always inserted MPLS labels after VLAN tags.
OpenFlow 1.3 and 1.4 insert MPLS labels before VLAN tags.
OpenFlow 1.3.4 and 1.5, both in preparation, recognize that the change in
1.3 was an error and revert it. This commit implements that reversion
in Open vSwitch.
EXT-457. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Simon Horman <horms@verge.net.au>
YAMAMOTO Takashi [Fri, 24 Jan 2014 01:31:13 +0000 (10:31 +0900)]
daemon.at: Fix stderr races
When spawning ovsdb-server in background, redirect stderr to /dev/null.
Otherwise, the banner output (eg. "ovsdb-server (Open vSwitch) 2.1.90")
can mess stderr of the following commands and make these tests fail.
Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Signed-off-by: Ben Pfaff <blp@nicira.com>
daemon-windows: Ability to handle windows service calls.
The following code does not add any users yet.
The visioned workflow that this piece of code should work with is:
* Create a windows service through a startup script with
a tool like 'sc'
ex: sc create ovsdb-server binpath=
"C:\openvswitch\usr\sbin\ovsdb-server.exe -vconsole:off
-vsyslog:off -vfile:info --remote=ptcp:6632:127.0.0.1 --log-file
--service-monitor --service"
* Start the service from the startup script.
ex: sc start ovsdb-server
* Terminate the service during shutdown process.
ex: sc stop ovsdb-server
* Abrupt termination will restart the service.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Ken Ajiro [Tue, 28 Jan 2014 01:20:43 +0000 (01:20 +0000)]
ovs-vsctl: Update will be discarded when multiple ovs-vsctl are executed
When two ovs-vsctl update map type column at same time, one ovs-vsctl's
update will be discarded although all ovs-vsctl succeeded. This patch
fixes this issue.
Signed-off-by: Ken Ajiro <ajiro@mxw.nes.nec.co.jp> Signed-off-by: Ben Pfaff <blp@nicira.com>
Simon Horman [Wed, 29 Jan 2014 06:13:19 +0000 (15:13 +0900)]
ofproto-dpif-xlate: Remove unused fitnessp pararameter from xlate_receive
Some functions pass a non-NULL value as this parameter
but none of those function uses the value xlate_receive()
returns there. So simply remove the parameter all together.
Also remove the now unused key_fitness field of struct flow_miss.
Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Pfaff <blp@nicira.com>
This function must cast a const value to a non const value.
By adding an uintptr_t cast the warning is suppressed.
To avoid the cast (proper solution) several function signatures
must be changed.
Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
This change, firstly, avoids declaring the formal parameter const,
since it is treated as non const. (to avoid -Wcast-qual)
Secondly, it cast the pointer from void* to u8*, since it is used
in arithmetic (to avoid -Wpointer-arith)
Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
Andy Zhou [Fri, 31 Jan 2014 23:47:58 +0000 (15:47 -0800)]
datapth: Suppress error messages on megaflow updates
With subfacets, we'd expect megaflow updates message to carry
the original micro flow. If not, EINVAL is returned and kernel
logs an error message. Now that the user space subfacet layer is
removed, it is expected that flow updates can arrive with a
micro flow other than the original. Change the return code to
EEXIST and remove the kernel error log message.
Reported-by: Ben Pfaff <blp@nicira.com> Signed-off-by: Andy Zhou <azhou@nicira.com>
Ben Pfaff [Tue, 10 Dec 2013 01:28:32 +0000 (17:28 -0800)]
socket-util: Remove unused functions.
A Windows porter mentioned to me that these functions caused special
trouble in the Windows port. However, they are no longer used, so we
might as well remove them.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
Pravin B Shelar [Tue, 28 Jan 2014 02:18:33 +0000 (18:18 -0800)]
datapath: Fix ovs_flow_free() ovs-lock assert.
ovs_flow_free() is not called under ovs-lock during packet
execute path (ovs_packet_cmd_execute()). Since packet execute
does not touch flow->mask, there is no need to take that
lock either. So move assert in case where flow->mask is checked.
Found by code inspection.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
cccl: Handle library paths for one shot compilation.
When one wishes to compile and link a program with an
external library in one shot, additional
option "-link" is expected after all the other options. For example,
Ben Pfaff [Fri, 31 Jan 2014 00:57:16 +0000 (16:57 -0800)]
bridge: Set ofport column in every database transaction.
Database transactions can occasionally fail due to concurrent changes in
the database. When that happens, the next transaction should repeat the
changes that ovs-vswitchd tried to make the first time (adjusted for the
changes to the database).
The code to report the OpenFlow port number in use didn't do that. It set
the ofport field once when it created the port and never set it again, even
if the transaction to set it failed. This commit fixes the problem.
Ethan Jackson [Tue, 28 Jan 2014 00:40:27 +0000 (16:40 -0800)]
ofproto-dpif-upcall: Hardcode max_idle to 1500ms.
Before this patch, OVS tried to guess an optimal max idle time for
datapath flows based on the number of datapath flows relative to the
limit. This caused instability because the limit was based on the
dump duration which was affected by the max idle time. This patch
chooses instead to hardcode the max idle time to 1.5s except in
extreme case where the datapath flow limit is exceeded. 1.5s was
chosen to ensure pings occurring at once per second stay cached in the
datapath.
Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Joe Stringer <joestringer@nicira.com>
getopt_long: Copy over the implementation from netbsd.
Windows does not have a getopt_long function. This commit
copies over the getopt_long implementation from netbsd with
some minor modifications and is used only on Windows platform.
Modifications on top of the version in NetBSD repo.
* Remove header files not available in Visual Studio.
* Remove some unwanted #defines.
* Add Open vSwitch specific header files like config.h, vlog.h, util.h
* Add the following #define's
define __UNCONST(a) ((void *)(unsigned long)(const void *)(a))
define _DIAGASSERT(q) ovs_assert(q)
define warnx VLOG_WARN
* Add extern declaration in getopt.h for optarg, optind.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
syslog: Provide stub functions for openlog and syslog.
One option to implement openlog and syslog functionality in Windows
is to use windows event logger. But it looks like it involves changing
registry settings and in general looks complicated.
For the time being, do nothing for syslog. All the information needed for
debugging will be present through the 'file' option anyways.
We can start OVS daemons on Windows with "-vfile:info -vsyslog:off".
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Andy Zhou [Wed, 18 Dec 2013 21:55:25 +0000 (13:55 -0800)]
netdev-dummy: Add support for active stream
The dummy ports thus far only support passive connections. It can
listen for multiple incoming connection requests but not make active
connections. This patch adds support of active stream, so that a
dummy port can be configured with either passive or active connections.
The net result is that dummy ports can now connect to each other,
without being patch ports. This feature will be useful in adding test
cases of future commits.
commit c58cc9a460fd158e5250e59902e96ac677dc115f (datapath: Allow user space to
announce ability to accept unaligned Netlink messages) introduced
OVS_DP_ATTR_USER_FEATURES netlink attribute in datapath responses,
but the attribute size was not taken into account in ovs_dp_cmd_msg_size().
Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
Luigi Rizzo [Thu, 23 Jan 2014 16:24:03 +0000 (17:24 +0100)]
lib/pcap-file: add 'ovs_' prefix to pcap functions
This is done to avoid collisions and confusions with libpcap symbols,
like pcap_read()
Signed-off-by: Luigi Rizzo <rizzo@iet.unipi.it> Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ethan Jackson [Wed, 22 Jan 2014 01:55:41 +0000 (17:55 -0800)]
Makefile: Use AM_V_GEN instead of echo.
The AM_V_GEN macro fits more cleanly with the automake silent rules
option. When enabled it will print "GEN <filename>" instead of simply
echoing the command as before.
Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>