Co-authored-by: Linda Sun <lsun@vmware.com> Signed-off-by: Linda Sun <lsun@vmware.com> Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Thu, 20 Feb 2014 20:13:26 +0000 (12:13 -0800)]
ofproto: Update only OFPUTIL_PS_LINK_DOWN (not STP) from netdev state.
When a netdev indicates that its state or configuration has changed,
update_port() updates the OpenFlow port to match the changes. However,
this was being taken too far: a netdev does not have an STP state, and a
state change was resetting the STP state of the port. This fixes the
problem.
Simon Horman [Thu, 20 Feb 2014 04:48:10 +0000 (13:48 +0900)]
tests/run-ryu: Correct logfile reporting
$logfile is already prefixed by "$sandbox/" and suffixed by ".log"
so do not duplicate this prefix and suffix combination when appending
$logfile to $logs.
Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Wed, 15 Jan 2014 20:59:16 +0000 (12:59 -0800)]
netdev: Change netdev_class_rwlock to recursive mutex, for POSIX safety.
With glibc, rwlocks by default allow recursive read-locking even if a
thread is blocked waiting for the write-lock. POSIX allows such attempts
to deadlock, and it appears that the libc used in NetBSD, at least, does
deadlock. The netdev_class_rwlock is in fact acquired recursively in this
way, which is a bug. This commit fixes the problem by switching to a
recursive mutex. This allows for less parallelism, but according to an
existing comment that doesn't matter here anyway.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Joe Stringer <joestringer@nicira.com>
Currently, ovsdb-server uses the process subsystem to handle
the "--run" command line option. That option is not used
often, so make it unsupported on the Windows platform until
it is deemed necessary.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Simon Horman [Wed, 12 Feb 2014 01:13:02 +0000 (10:13 +0900)]
packets: Always set ethertype in push_mpls()
There are two different MPLS ethertypes, 0x8847 and 0x8848, and a push MPLS
action applied to an MPLS packet may cause the ethertype to change from one
to the other. To ensure that this happens, update the ethertype in
push_mpls() regardless of whether the packet is already MPLS.
Test based on a similar test by Joe Stringer.
Cc: Joe Stringer <joestringer@nicira.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Pfaff <blp@nicira.com>
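The behaviour described above can be sketched on a toy packet model. Everything here (struct toy_packet, toy_push_mpls(), the TOY_ constants, host byte order) is an assumption made for illustration and is not the real push_mpls() code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_ETH_TYPE_MPLS       0x8847  /* First MPLS ethertype. */
#define TOY_ETH_TYPE_MPLS_MCAST 0x8848  /* Second MPLS ethertype. */

/* Hypothetical, heavily simplified packet: just the ethertype (in host byte
 * order) and the depth of the MPLS label stack. */
struct toy_packet {
    uint16_t eth_type;
    int n_labels;
};

static bool
toy_eth_type_is_mpls(uint16_t eth_type)
{
    return eth_type == TOY_ETH_TYPE_MPLS
           || eth_type == TOY_ETH_TYPE_MPLS_MCAST;
}

/* Pushes one label and always writes the requested ethertype, even when the
 * packet already carries an MPLS ethertype. */
static void
toy_push_mpls(struct toy_packet *p, uint16_t mpls_eth_type)
{
    assert(toy_eth_type_is_mpls(mpls_eth_type));
    p->n_labels++;
    p->eth_type = mpls_eth_type;
}
```

Pushing with 0x8848 onto a packet whose ethertype is already 0x8847 leaves the packet with ethertype 0x8848, which is the case the commit is concerned with.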
Joe Stringer [Wed, 12 Feb 2014 01:13:01 +0000 (10:13 +0900)]
tests: Add MPLS push on MPLS test.
Signed-off-by: Joe Stringer <joestringer@nicira.com> Co-Authored-By: Simon Horman <horms@verge.net.au> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ansis Atteka [Tue, 18 Feb 2014 21:19:36 +0000 (13:19 -0800)]
ovs-vsctl: reconnect to the database if connection was dropped
If ovs-vsctl has to wait for ovs-vswitchd to reconfigure itself
according to the new database, then ovs-vsctl could sometimes
end up stuck in the event loop if the OVSDB connection was dropped
while ovs-vsctl was still running.
This patch fixes the problem by letting ovs-vsctl reconnect
to the OVSDB if it has to wait for the cur_cfg field to be updated.
Joe Stringer [Wed, 19 Feb 2014 18:26:31 +0000 (10:26 -0800)]
ofproto: Remove 'force-miss-model' configuration.
This configuration item was introduced to assist testing of upcall
handling behaviour with and without facets. Facets were removed in
commit e79a6c833e0d7237, so this patch removes the configuration item.
Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Simon Horman [Wed, 19 Feb 2014 12:27:16 +0000 (21:27 +0900)]
ofp-actions: Correct pop MPLS ethtype as consistency test
Correct pop MPLS ethtype consistency check to verify that
the packet has an MPLS ethtype before the pop action rather than after:
an MPLS ethtype is a pre-condition but not a post-condition of pop MPLS.
With this change the consistency check in ofpact_check__()
becomes consistent with that in ofpact_from_nxast().
This was found using Ryu tests via the new make check-ryu target.
It allows all of the "POP_MPLS"[1] tests to pass where they previously
failed.
Jarno Rajahalme [Tue, 18 Feb 2014 17:07:03 +0000 (09:07 -0800)]
datapath: Per NUMA node flow stats.
Keep kernel flow stats for each NUMA node rather than each (logical)
CPU. This avoids using the per-CPU allocator and removes most of the
kernel-side OVS locking overhead that otherwise appears at the top of
perf reports, and allows OVS to scale better with a higher number of threads.
With 9 handlers and 4 revalidators, the netperf TCP_CRR test flow setup
rate doubles on a server with two hyper-threaded physical CPUs (16
logical cores each) compared to the current OVS master. Tested with
non-trivial flow table with a TCP port match rule forcing all new
connections with unique port numbers to OVS userspace. The IP
addresses are still wildcarded, so the kernel flows are not considered
as exact match 5-tuple flows. This type of flow can be expected to
appear in large numbers as the result of more effective wildcarding
made possible by improvements in the OVS userspace flow classifier.
There is a small increase in kernel spinlock overhead due to the same
spinlock being shared between multiple cores of the same physical CPU,
but that is barely visible in the netperf TCP_CRR test performance
(maybe ~1% performance drop, hard to tell exactly due to variance in
the test results), when testing for kernel module throughput (with no
userspace activity, handful of kernel flows).
On flow setup, a single stats instance is allocated (for the NUMA node
0). As CPUs from multiple NUMA nodes start updating stats, new
NUMA-node specific stats instances are allocated. This allocation on
the packet processing code path is made to never block or look for
emergency memory pools, minimizing the allocation latency. If the
allocation fails, the existing preallocated stats instance is used.
Also, if only CPUs from one NUMA-node are updating the preallocated
stats instance, no additional stats instances are allocated. This
eliminates the need to pre-allocate stats instances that will not be
used, also relieving the stats reader from the burden of reading stats
that are never used.
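The lazy allocation scheme above can be sketched in userspace C. This is a single-threaded illustration under stated assumptions: the toy_ names and TOY_MAX_NODES are hypothetical, calloc() stands in for the kernel's non-blocking allocation, no locking is shown, and the sketch always allocates for a node other than 0, whereas the commit avoids that when only one node is updating.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_MAX_NODES 4              /* Hypothetical NUMA node count. */

struct toy_stats {
    uint64_t packets;
    uint64_t bytes;
};

struct toy_flow {
    struct toy_stats *stats[TOY_MAX_NODES];   /* stats[0] always exists. */
};

/* Flow setup: allocate only the node 0 instance.  Returns 0 on success. */
static int
toy_flow_init(struct toy_flow *flow)
{
    for (int i = 0; i < TOY_MAX_NODES; i++) {
        flow->stats[i] = NULL;
    }
    flow->stats[0] = calloc(1, sizeof *flow->stats[0]);
    return flow->stats[0] ? 0 : -1;
}

/* Packet path: account one packet seen by a CPU on 'node'.  If a new
 * instance cannot be allocated, fall back to the preallocated one. */
static void
toy_flow_update(struct toy_flow *flow, int node, uint64_t bytes)
{
    struct toy_stats *s;

    assert(node >= 0 && node < TOY_MAX_NODES);
    s = flow->stats[node];
    if (!s) {
        s = calloc(1, sizeof *s);
        if (s) {
            flow->stats[node] = s;
        } else {
            s = flow->stats[0];
        }
    }
    s->packets++;
    s->bytes += bytes;
}

/* Reader: sum only the instances that were actually allocated. */
static uint64_t
toy_flow_packets(const struct toy_flow *flow)
{
    uint64_t total = 0;

    for (int i = 0; i < TOY_MAX_NODES; i++) {
        if (flow->stats[i]) {
            total += flow->stats[i]->packets;
        }
    }
    return total;
}

static void
toy_flow_destroy(struct toy_flow *flow)
{
    for (int i = 0; i < TOY_MAX_NODES; i++) {
        free(flow->stats[i]);
        flow->stats[i] = NULL;
    }
}
```

The reader skips NULL slots, which mirrors the point that stats instances that were never needed are never read.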
Ben Pfaff [Fri, 14 Feb 2014 18:34:58 +0000 (10:34 -0800)]
tests: Add support for automatically running Ryu tests against OVS.
The Ryu controller comes with an extensive library of OpenFlow tests, but
it doesn't seem so easy to me to run all of them against a development
version of Open vSwitch. This commit introduces a Makefile target so that
one can run all the Ryu tests with a simple "make check-ryu".
This commit adds documentation for the new target to INSTALL. It also
moves the documentation for the "check-oftest" target and the
--enable-coverage configure option into INSTALL.
Signed-off-by: Ben Pfaff <blp@nicira.com> Reviewed-by: Simon Horman <horms@verge.net.au>
Jarno Rajahalme [Tue, 11 Feb 2014 23:34:39 +0000 (15:34 -0800)]
datapath: Fix race.
ovs_vport_cmd_dump() did rcu_read_lock() only after getting the
datapath, which could have been deleted in between. Resolved by
taking rcu_read_lock() before the get_dp() call.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Jarno Rajahalme [Tue, 11 Feb 2014 17:49:30 +0000 (09:49 -0800)]
ofproto: Move 'rule->used' to the provider.
A rule's 'used' timestamp is updated at the same time as the other
stats. So far, 'used' has been updated without proper protection,
which may lead to 'tearing' on 32-bit architectures, resulting in an
incorrect 'used' timestamp. While in practice this is highly
improbable, it is still better to handle this correctly.
This is resolved by moving the 'used' field to the provider's stats,
so that whatever protection is used for updating the packet and byte
counts can also be used for both reading and writing the 'used'
timestamp.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Jarno Rajahalme [Fri, 7 Feb 2014 19:34:02 +0000 (11:34 -0800)]
ofproto: Optimize the case of a repeated learn action execution.
When the target flow exists already as intended, only the 'modified'
time needs to be updated. Allow modification without taking the
'ofproto_mutex' by always using the 'rule->mutex' when accessing the
'modified' time.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Andy Zhou [Sat, 8 Feb 2014 01:25:42 +0000 (17:25 -0800)]
netdev-dummy: Reduce reconnect back off timeout
netdev-dummy will mostly be used for testing and debugging over a fairly
reliable connection. Reduce the reconnect back-off timeout in case the first
connect attempt fails.
Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Andy Zhou [Fri, 7 Feb 2014 22:45:14 +0000 (14:45 -0800)]
netdev-dummy: Fix reconnecting.
The netdev-dummy unit test ran into the reconnect condition on Jarno's
machine. With his test environment, we found and fixed the bugs in
handling reconnect.
Ben Pfaff [Tue, 11 Feb 2014 23:13:56 +0000 (15:13 -0800)]
tunnel: Support all combinations of flow-based and specific tunnel matches.
There are 12 possible ways to specify a tunnel (2 * 2 * 3 == 12):
- Specific in_key or flow-based (2 choices).
- Specific ip_dst or flow-based (2 choices).
- Specific ip_src, wildcarded, or flow-based (3 choices).
Until now, only 6 of the 12 possibilities have been supported. We
have had a couple of requests to add another. This commit adds all the
possibilities, so that we won't have to add the other 6 one by one.
Signed-off-by: Ben Pfaff <blp@nicira.com> Requested-by: Thomas Morin <thomas.morin@orange.com> Acked-by: pritesh <pritesh.kothari@cisco.com>
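The 2 * 2 * 3 count above can be checked with a small enumeration. This is only a sketch: the toy_ names and the particular index layout are assumptions made for illustration, not necessarily how the tunnel code organizes its matches.

```c
#include <assert.h>
#include <stdbool.h>

/* The three ways to treat ip_src: specific, wildcarded, or flow-based. */
enum toy_ip_src {
    TOY_IP_SRC_CFG,
    TOY_IP_SRC_ANY,
    TOY_IP_SRC_FLOW,
    TOY_N_IP_SRC
};

#define TOY_N_MATCH_TYPES (2 * 2 * TOY_N_IP_SRC)

/* Maps one combination to a distinct index in [0, TOY_N_MATCH_TYPES). */
static int
toy_match_index(bool in_key_flow, bool ip_dst_flow, enum toy_ip_src ip_src)
{
    return (in_key_flow * 2 + ip_dst_flow) * TOY_N_IP_SRC + ip_src;
}

/* Walks every combination and counts how many distinct indexes appear. */
static int
toy_count_distinct_match_types(void)
{
    bool seen[TOY_N_MATCH_TYPES] = { false };
    int n = 0;

    for (int k = 0; k < 2; k++) {
        for (int d = 0; d < 2; d++) {
            for (int s = 0; s < TOY_N_IP_SRC; s++) {
                int idx = toy_match_index(k, d, s);

                if (!seen[idx]) {
                    seen[idx] = true;
                    n++;
                }
            }
        }
    }
    return n;
}
```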
Alex Wang [Thu, 6 Feb 2014 23:46:05 +0000 (15:46 -0800)]
bond: Change the way of assigning bond slave for unassigned bond entry.
Before this commit, OVS randomly selected a slave for an unassigned
bond entry. If the selected slave was not enabled, the active slave
was chosen instead. In this commit, the slave is selected from the
list of all enabled slaves in a round-robin fashion. This helps
improve the consistency of bond behavior when new flows are added.
Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
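Round-robin selection over enabled slaves can be sketched as follows. struct toy_bond, its fields, and toy_bond_pick_slave() are hypothetical and much simpler than the real bond code; the sketch only shows the selection order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_SLAVES 8

/* Hypothetical bond: a fixed array of slaves plus a round-robin cursor. */
struct toy_bond {
    bool enabled[TOY_MAX_SLAVES];
    size_t n_slaves;
    size_t next;                 /* Index at which the next search starts. */
};

/* Returns the index of the next enabled slave in round-robin order, or -1
 * if no slave is enabled. */
static int
toy_bond_pick_slave(struct toy_bond *bond)
{
    for (size_t i = 0; i < bond->n_slaves; i++) {
        size_t idx = (bond->next + i) % bond->n_slaves;

        if (bond->enabled[idx]) {
            bond->next = (idx + 1) % bond->n_slaves;
            return (int) idx;
        }
    }
    return -1;
}
```

With slaves 0 and 2 enabled and slave 1 disabled, successive calls return 0, 2, 0, 2, so new entries are spread evenly over the enabled slaves.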
Joe Stringer [Sat, 8 Feb 2014 00:39:54 +0000 (16:39 -0800)]
tests: Fix MPLS test cases.
The "userspace" MPLS test case was checking the same things as the
"drop" test case, rather than checking to see that packets were being
sent to userspace. This patch makes the testsuite consistent with itself.
Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
This commit creates events and through poll_fd_wait_event()
associates them with socket file descriptors to get woken up
from poll_block().
Some other changes:
* Windows does not have sys/fcntl.h but has a fcntl.h.
On Linux, there is fcntl.h too.
* include <openssl/applink.c> to handle different C-Runtime linking
of OVS and openssl libraries as suggested at
https://www.openssl.org/support/faq.html#PROG2
The above include will not be needed if we compile Open vSwitch with
/MD compiler option.
* SHUT_RDWR is equivalent to SD_BOTH on Windows.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
poll-loop: Make poll_fd_wait_event() cross-platform.
This is helpful if we want to wait either on 'fd' for POSIX or
events for Windows.
For Windows, if both 'fd' and 'wevent' are specified, we associate
that event with the 'fd' using WSAEventSelect() in poll_block().
So any 'events' on that 'fd' will wake us up from WaitForMultipleObjects().
WSAEventSelect() does not understand POLLIN, POLLOUT etc. Instead the
macros it understands are FD_READ, FD_ACCEPT, FD_CONNECT, FD_CLOSE etc.
So we need to make that transition.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
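The transition from poll() event bits to WSAEventSelect() event bits might look like the sketch below. The TOY_FD_* macros are stand-ins that reuse the numeric values of the Winsock FD_* bits so that the example compiles on POSIX, and the particular grouping (which FD_* bits correspond to POLLIN and to POLLOUT) is an assumption, not taken from the commit.

```c
#include <assert.h>
#include <poll.h>

/* Stand-ins for the Winsock network event bits (assumed values). */
#define TOY_FD_READ    0x01
#define TOY_FD_WRITE   0x02
#define TOY_FD_ACCEPT  0x08
#define TOY_FD_CONNECT 0x10
#define TOY_FD_CLOSE   0x20

/* Translates a poll() 'events' mask into a WSAEventSelect()-style mask. */
static long
toy_poll_to_wsa_events(short poll_events)
{
    long wsa_events = 0;

    if (poll_events & POLLIN) {
        wsa_events |= TOY_FD_READ | TOY_FD_ACCEPT | TOY_FD_CLOSE;
    }
    if (poll_events & POLLOUT) {
        wsa_events |= TOY_FD_WRITE | TOY_FD_CONNECT | TOY_FD_CLOSE;
    }
    return wsa_events;
}
```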
Ben Pfaff [Tue, 11 Feb 2014 16:24:16 +0000 (08:24 -0800)]
ofproto-dpif-xlate: Make flows that match ICMP fields revalidate correctly.
ICMPv4 and ICMPv6 have 8-bit "type" and "code" fields. struct flow
uses the low 8 bits of the 16-bit tp_src and tp_dst members to
represent these fields. The datapath interface, on the other hand,
represents them with just 8 bits each. This means that if the high 8
bits of the masks for these fields somehow become set (meaning to
match on the nonexistent "high bits" of these fields) during
translation, then they will get chopped off by a round trip through
the datapath, and revalidation will spot that as an inconsistency and
delete the flow. This commit avoids the problem by making sure that
only the low 8 bits of either field can be unwildcarded for ICMP.
This seems like the minimal fix for this problem, appropriate for
backporting to earlier branches. The root of the issue is that these high
bits can get set in the match at all. I have some leads on that, but they
require more invasive changes elsewhere.
Bug #23320. Reported-by: Krishna Miriyala <miriyalak@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
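Restricting a mask to the low 8 bits of a 16-bit network-byte-order field can be sketched like this. toy_icmp_field_mask() is a hypothetical helper introduced for illustration; the real fix operates on the wildcard masks for tp_src and tp_dst during translation.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/* 'mask_be' is a mask for a 16-bit field in network byte order.  Returns the
 * mask with only the low 8 bits of the field still allowed to be set, which
 * is all that an 8-bit ICMP type or code can occupy. */
static uint16_t
toy_icmp_field_mask(uint16_t mask_be)
{
    return mask_be & htons(0xff);
}
```

Because the constant is converted with htons(), the same expression clears the correct byte on both little-endian and big-endian hosts.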
Ben Pfaff [Tue, 4 Feb 2014 17:01:16 +0000 (09:01 -0800)]
pcap-file: Allow capturing TCP streams where the SYN is not seen.
Until now, the tcp_stream() code has ignored any TCP packets that don't
correspond to a stream that began with a TCP SYN. This commit changes the
code so that any TCP packet that contains a payload starts a new stream.
Signed-off-by: Ben Pfaff <blp@nicira.com> Reported-by: Vasu Dasari <vdasari@gmail.com>
Ben Pfaff [Tue, 4 Feb 2014 20:39:37 +0000 (12:39 -0800)]
Always insert MPLS labels after VLAN tags.
OpenFlow 1.1 and 1.2 always inserted MPLS labels after VLAN tags.
OpenFlow 1.3 and 1.4 insert MPLS labels before VLAN tags.
OpenFlow 1.3.4 and 1.5, both in preparation, recognize that the change in
1.3 was an error and revert it. This commit implements that reversion
in Open vSwitch.
EXT-457. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Simon Horman <horms@verge.net.au>
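Placing the label stack after any VLAN tags amounts to a simple offset calculation on an Ethernet frame. The sketch below assumes a standard 14-byte Ethernet header and 4-byte 802.1Q tags; the toy_ names are hypothetical and the code is not taken from Open vSwitch.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_ETH_HEADER_LEN  14   /* Destination, source, ethertype. */
#define TOY_VLAN_HEADER_LEN 4    /* One 802.1Q tag: TPID plus TCI. */

/* Byte offset at which the first MPLS label stack entry starts when labels
 * are inserted after 'n_vlan_tags' VLAN tags. */
static size_t
toy_mpls_insert_offset(size_t n_vlan_tags)
{
    return TOY_ETH_HEADER_LEN + n_vlan_tags * TOY_VLAN_HEADER_LEN;
}
```

For an untagged frame the labels start at byte 14; each VLAN tag moves that point 4 bytes further into the frame.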
YAMAMOTO Takashi [Fri, 24 Jan 2014 01:31:13 +0000 (10:31 +0900)]
daemon.at: Fix stderr races
When spawning ovsdb-server in background, redirect stderr to /dev/null.
Otherwise, the banner output (eg. "ovsdb-server (Open vSwitch) 2.1.90")
can mess stderr of the following commands and make these tests fail.
Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Signed-off-by: Ben Pfaff <blp@nicira.com>
daemon-windows: Ability to handle windows service calls.
This code does not add any users yet.
The envisioned workflow that this piece of code should work with is:
* Create a windows service through a startup script with
a tool like 'sc'
ex: sc create ovsdb-server binpath=
"C:\openvswitch\usr\sbin\ovsdb-server.exe -vconsole:off
-vsyslog:off -vfile:info --remote=ptcp:6632:127.0.0.1 --log-file
--service-monitor --service"
* Start the service from the startup script.
ex: sc start ovsdb-server
* Terminate the service during shutdown process.
ex: sc stop ovsdb-server
* Abrupt termination will restart the service.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Ken Ajiro [Tue, 28 Jan 2014 01:20:43 +0000 (01:20 +0000)]
ovs-vsctl: Update will be discarded when multiple ovs-vsctl are executed
When two ovs-vsctl processes update a map-type column at the same time, one
process's update is discarded even though every ovs-vsctl reports success.
This patch fixes this issue.
Signed-off-by: Ken Ajiro <ajiro@mxw.nes.nec.co.jp> Signed-off-by: Ben Pfaff <blp@nicira.com>
Simon Horman [Wed, 29 Jan 2014 06:13:19 +0000 (15:13 +0900)]
ofproto-dpif-xlate: Remove unused fitnessp parameter from xlate_receive
Some functions pass a non-NULL value as this parameter,
but none of those functions uses the value that xlate_receive()
returns there. So simply remove the parameter altogether.
Also remove the now unused key_fitness field of struct flow_miss.
Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Pfaff <blp@nicira.com>
This function must cast a const value to a non-const value.
Adding a uintptr_t cast suppresses the warning.
Avoiding the cast (the proper solution) would require changing
several function signatures.
Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
This change, firstly, avoids declaring the formal parameter const,
since it is treated as non-const (to avoid -Wcast-qual).
Secondly, it casts the pointer from void* to u8*, since it is used
in arithmetic (to avoid -Wpointer-arith).
Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
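The pointer-arithmetic half of this change can be illustrated in portable C. toy_advance() is a hypothetical helper, and uint8_t stands in for the kernel's u8 type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the address 'n' bytes past 'p'.  Arithmetic on void * is a GNU
 * extension that -Wpointer-arith warns about, so convert to a byte pointer
 * before adding. */
static void *
toy_advance(void *p, size_t n)
{
    return (uint8_t *) p + n;
}
```

The parameter is deliberately not declared const, because the returned pointer allows the caller to modify the buffer.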
Andy Zhou [Fri, 31 Jan 2014 23:47:58 +0000 (15:47 -0800)]
datapath: Suppress error messages on megaflow updates
With subfacets, we would expect a megaflow update message to carry
the original micro flow. If it did not, EINVAL was returned and the
kernel logged an error message. Now that the user space subfacet layer
is removed, it is expected that flow updates can arrive with a
micro flow other than the original. Change the return code to
EEXIST and remove the kernel error log message.
Reported-by: Ben Pfaff <blp@nicira.com> Signed-off-by: Andy Zhou <azhou@nicira.com>
Ben Pfaff [Tue, 10 Dec 2013 01:28:32 +0000 (17:28 -0800)]
socket-util: Remove unused functions.
A Windows porter mentioned to me that these functions caused special
trouble in the Windows port. However, they are no longer used, so we
might as well remove them.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
Pravin B Shelar [Tue, 28 Jan 2014 02:18:33 +0000 (18:18 -0800)]
datapath: Fix ovs_flow_free() ovs-lock assert.
ovs_flow_free() is not called under ovs-lock in the packet
execute path (ovs_packet_cmd_execute()). Since packet execute
does not touch flow->mask, there is no need to take that
lock either. So move the assert to the case where flow->mask is checked.
Found by code inspection.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
cccl: Handle library paths for one shot compilation.
When one wishes to compile and link a program with an
external library in one shot, additional
option "-link" is expected after all the other options. For example,
Ben Pfaff [Fri, 31 Jan 2014 00:57:16 +0000 (16:57 -0800)]
bridge: Set ofport column in every database transaction.
Database transactions can occasionally fail due to concurrent changes in
the database. When that happens, the next transaction should repeat the
changes that ovs-vswitchd tried to make the first time (adjusted for the
changes to the database).
The code to report the OpenFlow port number in use didn't do that. It set
the ofport field once when it created the port and never set it again, even
if the transaction to set it failed. This commit fixes the problem.
Ethan Jackson [Tue, 28 Jan 2014 00:40:27 +0000 (16:40 -0800)]
ofproto-dpif-upcall: Hardcode max_idle to 1500ms.
Before this patch, OVS tried to guess an optimal max idle time for
datapath flows based on the number of datapath flows relative to the
limit. This caused instability because the limit was based on the
dump duration which was affected by the max idle time. This patch
chooses instead to hardcode the max idle time to 1.5s, except in the
extreme case where the datapath flow limit is exceeded. 1.5s was
chosen to ensure that pings sent once per second stay cached in the
datapath.
Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Joe Stringer <joestringer@nicira.com>
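The resulting policy reduces to a single comparison, sketched below. toy_max_idle() and both constants are hypothetical names; in particular the 100 ms value used when the flow limit is exceeded is an assumption made for illustration, since the commit message does not state it.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_IDLE_MS        1500  /* Hardcoded value from the commit. */
#define TOY_OVER_LIMIT_IDLE_MS 100   /* Assumed value, for illustration. */

/* Returns the idle time, in milliseconds, after which a datapath flow may be
 * evicted.  Only when the number of flows exceeds the limit is a shorter
 * idle time used. */
static long long
toy_max_idle(size_t n_flows, size_t flow_limit)
{
    return n_flows > flow_limit ? TOY_OVER_LIMIT_IDLE_MS : TOY_MAX_IDLE_MS;
}
```

Because the normal value no longer depends on the flow count, the feedback loop between idle time and dump duration described above cannot arise below the limit.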