Andy Zhou [Wed, 5 Mar 2014 23:27:31 +0000 (15:27 -0800)]
ofproto/bond: Implement bond megaflow using recirculation
Infrastructure to enable megaflow support for bond ports using
recirculation. This patch adds the following features:
* Generate RECIRC action when bond can benefit from recirculation.
* Populate post recirculation rules in a hidden table. Currently table 254.
* Uses post recirculation rules for bond rebalancing
* A recirculation implementation in dpif-netdev.
The goal of this patch is to be able to megaflow bond outputs and
thus greatly improve performance. However, this patch does not
actually improve the megaflow generation. It is left for a later commit.
Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Fri, 4 Apr 2014 20:37:33 +0000 (13:37 -0700)]
tests: Remove problematic but questionably useful part of ovsdb-server test.
The test "ovsdb-server combines updates on backlogged connections" first
checks that ovsdb-server can combine transactions, and then checks that
it can keep them separate. The latter part is hard to force because it
requires making sure that the socket buffer never fills up, and it also
isn't a very useful test (it doesn't check for any kind of correctness).
Therefore, this commit removes it.
Simon Horman [Mon, 7 Apr 2014 16:43:04 +0000 (16:43 +0000)]
ofp-parse: Handle buffer resize when parsing actions
A call to ofpbuf_put() may cause the data of the passed to be resized and
its base address to change. Thus the address returned by ofpbuf_put()
should be used as the base address rather than relying on the base address
prior to ofpbuf_put().
This avoids the following assertion in ofpact_update_len() from failing
when ofpbuf_put() causes a resize in parse_note(), parse_noargs_dec_ttl(),
parse_dec_ttl().
ovs_assert(ofpact == ofpacts->frame);
This restores pointer updates removed by and resolves a regression
introduced by cf3b7538666cd1ef ("ofpbuf: Abstract 'l2' pointer and document
usage conventions.").
Test cases have also been added to exercise this buffer resize in
parse_note(), parse_noargs_dec_ttl(), parse_dec_ttl().
Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
YAMAMOTO Takashi [Mon, 31 Mar 2014 06:11:19 +0000 (15:11 +0900)]
ofproto-dpif.at: Wait for the monitor's pidfile disappears where necessary
These tests invokes ovs-ofctl monitor twice or more.
Because "ovs-appctl -t ofctl exit" does not wait for the target
process exit, there are chances to see the pid file from the previous
incarnation.
OVS_APP_EXIT_AND_WAIT macro was provided by Ben Pfaff.
Acked-by: Ben Pfaff <blp@nicira.com> Co-authored-by: Ben Pfaff <blp@nicira.com> Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Ben Pfaff [Sat, 5 Apr 2014 02:26:22 +0000 (19:26 -0700)]
ofp-print: Fix misaligned data access in ofp_print_error_msg().
The body of an OpenFlow error message often contains an inner OpenFlow
message, and when it does, the inner message starts at an odd multiple of 4
bytes from the beginning of the outer message. That means that, on RISC
systems, accessing the inner message directly causes a bus error. This
commit fixes the problem in a way that should make it difficult to recur.
This fixes the failure of tests 643, 645, and 651 on sparc seen here:
https://buildd.debian.org/status/fetch.php?pkg=openvswitch&arch=sparc&ver=2.1.0%2Bgit20140325-1&stamp=1396438624
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Ben Pfaff [Sat, 5 Apr 2014 03:21:15 +0000 (20:21 -0700)]
packets: Fix misaligned data accesses for MPLS and SCTP fields.
The other 32-bit data fields in protocol headers were already using
ovs_16aligned_be32, but MPLS and SCTP had been overlooked. This fixes
the failure of test 681 seen here:
https://buildd.debian.org/status/fetch.php?pkg=openvswitch&arch=sparc&ver=2.1.0%2Bgit20140325-1&stamp=1396438624
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Ben Pfaff [Sat, 5 Apr 2014 17:27:05 +0000 (10:27 -0700)]
dpif-netdev: Unwildcard entire odp_port in dpif_netdev_mask_from_nlattrs().
One case in the dpif_netdev_mask_from_nlattrs() function accidentally
wildcarded only a 16-bit subset of the mask's odp_port. On little-endian
machines this subset was the lower bits, which happened to work out OK,
but on big-endian machines this subset was the upper bits, which doesn't
work and causes a test failure. (The problem was actually visible in the
test expected results on little-endian machines, but we had not noticed.)
This commit unwildcards the whole field, fixing the problem, and updates
the test expected results to match.
This fixes the failure of test 732 seen here:
https://buildd.debian.org/status/fetch.php?pkg=openvswitch&arch=sparc&ver=2.1.0%2Bgit20140325-1&stamp=1396438624
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Ben Pfaff [Thu, 3 Apr 2014 22:27:18 +0000 (15:27 -0700)]
jsonrpc: Return received JSON-RPC messages immediately in jsonrpc_recv().
Until now, jsonrpc_recv() used separate iterations of its loop to receive
data, feed it to the JSON-RPC parser, and return the received message.
This is unnecessarily complicated and can occasionally mean that the
jsonrpc object has received and parsed but not returned a message. This
commit refactors the code to receive data, feed it to the parse, and
return the received message in a single iteration, and simplifies the code
in the process.
Reported-by: Chris Hydon <chydon@aristanetworks.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Simon Horman [Wed, 2 Apr 2014 09:04:27 +0000 (18:04 +0900)]
ofproto: Support OF version-specific table-miss behaviours
OpenFlow 1.1 and 1.2 specify that if a table-miss occurs then the default
behaviour is to forward the packet the controller using a packet-in
message. And until this patch this is the default behaviour that Open
vSwitch uses for all OpenFlow versions.
OpenFlow1.3+ specifies that if a table-miss occurs then the default
behaviour is simply to drop the packet. This patch implements this
behaviour using the following logic:
If a table-miss occurs and the table-miss behaviour for the table
has not been set using a table_mod (in which case it is no longer
the default setting) then:
* Installing a facet in the datapath with a drop action in the
if there are no pre-OF1.3 controllers connected which would receive
an packet_in message.
Note that this covers both the case where there are only OF1.3
controllers and the case where there are no controllers at all.
* Otherwise sent a packet_in message to all pre-OF1.3 controllers.
This covers both the case where there are only pre-OF1.3
controllers and there are both pre-OF1.3 and OF1.3+ controllers.
Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Pfaff <blp@nicira.com>
Add support for building the in-tree kernel datapath for
Linux kernels up to 3.13. There were some changes in the
netlink area which required adding new compatibility code
for this layer. Also, some new per-cpu stats initialization
code was added.
It should be an administrator task to bring up devices as they
are configured properly.
Currently, Fedora is deleting the bridges when the interface is
brought down. Therefore, there is no bridge on the next boot and
the initscripts can apply the networking configuration properly
for a new bridge.
However, if the system didn't execute ifdown for some reason, the
bridge is left in the ovsdb and since internal ports are brought
up by default, there is no way for initscripts to known if the
adminitrator has already configured it or not.
ofpbuf: Abstract 'l2' pointer and document usage conventions.
Rename 'l2' to 'frame' and add new ofpbuf_set_frame() and ofpbuf_l2().
ofpbuf_set_frame() alse resets all the layer offsets. ofpbuf_l2()
returns NULL if the packet has no Ethernet header, as indicated either
by unset l3 offset or NULL frame pointer. Callers of ofpbuf_l2() are
supposed to check the return value, unless they can otherwise be sure
that the packet has a valid Ethernet header.
The recent commit 437d0d22 made some assumptions that were not valid
regarding the use of the 'l2' pointer in rconn module and by
compose_rarp(). This is now fixed as follows: rconn now relies on the
fact that once OpenFlow messages are given to rconn for transport, the
frame pointer is no longer needed to refer to the OpenFlow header; and
compose_rarp() now sets the frame pointer and offsets as expected.
In addition to storing network frames, ofpbufs are also used for
handling OpenFlow messages and action lists. lib/ofpbuf.h now has a
comment documenting the current usage conventions and invariants.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
ofpbuf: Rename trivial _get_ functions without the "get".
Code reads better without the "get", for example "ofpbuf_l3()"
v.s. "ofpbuf_get_l3()". L4 payoad access functions still use the
"get" (e.g., "ofpbuf_get_tcp_payload()").
Andy Zhou [Mon, 31 Mar 2014 20:38:04 +0000 (13:38 -0700)]
unit-test: Improve ovstest user interface
Improve help output. Running without argument now exit with an error
message and an error code. Simplify OVSTEST_REGISTER() since not all
test programs uses sub_commands.
Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Wed, 2 Apr 2014 15:43:17 +0000 (08:43 -0700)]
jsonrpc-server: Combine notifications when connection becomes backlogged.
Connections that queue up too much data, because they are monitoring a
table that is changing quickly and failing to keep up with the updates,
cause problems with buffer management. Since commit 60533a405b2e
(jsonrpc-server: Disconnect connections that queue too much data.),
ovsdb-server has dealt with them by disconnecting the connection and
letting them start up again with a fresh copy of the database. However,
this is not ideal because of situations where disconnection happens
repeatedly. For example:
- A manager toggles a column back and forth between two or more values
quickly (in which case the data transmitted over the monitoring
connections always increases quickly, without bound).
- A manager repeatedly extends the contents of some column in some row
(in which case the data transmitted over the monitoring connection
grows with O(n**2) in the length of the string).
A better way to deal with this problem is to combine updates when they are
sent to the monitoring connection, if that connection is not keeping up.
In both the above cases, this reduces the data that must be sent to a
manageable amount. This commit implements this new way.
Bug #1211786.
Bug #1221378. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
Connections that queue up too much data, because they are monitoring a
table that is changing quickly and failing to keep up with the updates,
cause problems with buffer management. Since commit 60533a405b2e
(jsonrpc-server: Disconnect connections that queue too much data.),
ovsdb-server has dealt with them by disconnecting the connection and
letting them start up again with a fresh copy of the database. However,
this is not ideal because of situations where disconnection happens
repeatedly. For example:
- A manager toggles a column back and forth between two or more values
quickly (in which case the data transmitted over the monitoring
connections always increases quickly, without bound).
- A manager repeatedly extends the contents of some column in some row
(in which case the data transmitted over the monitoring connection
grows with O(n**2) in the length of the string).
A better way to deal with this problem is to combine updates when they are
sent to the monitoring connection, if that connection is not keeping up.
In both the above cases, this reduces the data that must be sent to a
manageable amount. An upcoming patch implements this new way. This commit
reverts part of the previous solution that disconnects backlogged
connections, since it is no longer useful.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
Connections that queue up too much data, because they are monitoring a
table that is changing quickly and failing to keep up with the updates,
cause problems with buffer management. Since commit 60533a405b2e
(jsonrpc-server: Disconnect connections that queue too much data.),
ovsdb-server has dealt with them by disconnecting the connection and
letting them start up again with a fresh copy of the database. However,
this is not ideal because of situations where disconnection happens
repeatedly. For example:
- A manager toggles a column back and forth between two or more values
quickly (in which case the data transmitted over the monitoring
connections always increases quickly, without bound).
- A manager repeatedly extends the contents of some column in some row
(in which case the data transmitted over the monitoring connection
grows with O(n**2) in the length of the string).
A better way to deal with this problem is to combine updates when they are
sent to the monitoring connection, if that connection is not keeping up.
In both the above cases, this reduces the data that must be sent to a
manageable amount. An upcoming patch implements this new way. This commit
reverts part of the previous solution that disconnects backlogged
connections, since it is no longer useful.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
Connections that queue up too much data, because they are monitoring a
table that is changing quickly and failing to keep up with the updates,
cause problems with buffer management. Since commit 60533a405b2e
(jsonrpc-server: Disconnect connections that queue too much data.),
ovsdb-server has dealt with them by disconnecting the connection and
letting them start up again with a fresh copy of the database. However,
this is not ideal because of situations where disconnection happens
repeatedly. For example:
- A manager toggles a column back and forth between two or more values
quickly (in which case the data transmitted over the monitoring
connections always increases quickly, without bound).
- A manager repeatedly extends the contents of some column in some row
(in which case the data transmitted over the monitoring connection
grows with O(n**2) in the length of the string).
A better way to deal with this problem is to combine updates when they are
sent to the monitoring connection, if that connection is not keeping up.
In both the above cases, this reduces the data that must be sent to a
manageable amount. An upcoming patch implements this new way. This commit
reverts part of the previous solution that disconnects backlogged
connections, since it is no longer useful.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
Connections that queue up too much data, because they are monitoring a
table that is changing quickly and failing to keep up with the updates,
cause problems with buffer management. Since commit 60533a405b2e
(jsonrpc-server: Disconnect connections that queue too much data.),
ovsdb-server has dealt with them by disconnecting the connection and
letting them start up again with a fresh copy of the database. However,
this is not ideal because of situations where disconnection happens
repeatedly. For example:
- A manager toggles a column back and forth between two or more values
quickly (in which case the data transmitted over the monitoring
connections always increases quickly, without bound).
- A manager repeatedly extends the contents of some column in some row
(in which case the data transmitted over the monitoring connection
grows with O(n**2) in the length of the string).
A better way to deal with this problem is to combine updates when they are
sent to the monitoring connection, if that connection is not keeping up.
In both the above cases, this reduces the data that must be sent to a
manageable amount. An upcoming patch implements this new way. This commit
reverts part of the previous solution that disconnects backlogged
connections, since it is no longer useful.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
Ben Pfaff [Tue, 1 Apr 2014 21:33:24 +0000 (14:33 -0700)]
ovsdb-server: Send update for _version for changes due to weak ref removal.
When a row was deleted, and that caused the transaction manager to remove
a remaining weak reference to the row, and no other change had been made
to the row in that transaction, and a client was monitoring modifications
of the _version column in the row, then the monitor update for the column
did not include the old contents of the _version column, even though it
should have. This commit fixes the problem.
Probably most clients only look at the new value of the column, if they
monitor _version at all, and this bug is really old, so it's probably
not a serious bug.
Found by inspection.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
Ben Pfaff [Wed, 2 Apr 2014 21:54:51 +0000 (14:54 -0700)]
debian: Allow kmod or module-init-tools for backward compatibility.
Commit d473844693 (debian: Depend on 'kmod' instead of module-init-tools.)
switched from depending on module-init-tools to depending on kmod, which
is the new name of the appropriate package in Debian. Unfortunately,
while kmod is the right name for the latest Debian distribution, it doesn't
have that name in old distributions and thus breaks the build on those.
This commit should work OK in either case, since it allows both names.
Pravin Shelar [Mon, 31 Mar 2014 20:17:24 +0000 (13:17 -0700)]
netdev-dpdk: Remove alloc from packet recv.
On DPDK packet recv, ovs is given pointer to mbuf which has
information about a packet, for example pointer to data and size.
By moving mbuf to ofpbuf we can let dpdk allocate ofpbuf and
pass that to ovs for processing the packet.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
ovs_flow_cmd_del() now allocates reply (if needed) after the flow has
already been removed from the flow table. If the reply allocation
fails, a netlink error is signaled with netlink_set_err(), as is
already done in ovs_flow_cmd_new_or_set() in the similar situation.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Reduce and clarify locking requirements for ovs_flow_cmd_alloc_info(),
ovs_flow_cmd_fill_info() and ovs_flow_cmd_build_info().
A datapath pointer is available only when holding a lock. Change
ovs_flow_cmd_fill_info() and ovs_flow_cmd_build_info() to take a
dp_ifindex directly, rather than a datapath pointer that is then
(only) used to get the dp_ifindex. This is useful, since the
dp_ifindex is available even when the datapath pointer is not, both
before and after taking a lock, which makes further critical section
reduction possible.
Make ovs_flow_cmd_alloc_info() take an 'acts' argument instead a
'flow' pointer. This allows some future patches to do the allocation
before acquiring the flow pointer.
The locking requirements after this patch are:
ovs_flow_cmd_alloc_info(): May be called without locking, must not be
called while holding the RCU read lock (due to memory allocation).
If 'acts' belong to a flow in the flow table, however, then the
caller must hold ovs_mutex.
ovs_flow_cmd_fill_info(): Either ovs_mutex or RCU read lock must be held.
ovs_flow_cmd_build_info(): This calls both of the above, so the caller
must hold ovs_mutex.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Bruce Davie [Wed, 2 Apr 2014 16:11:48 +0000 (09:11 -0700)]
vtep: Add IP address configuration for bfd.
The OVS implementation of BFD allows configuration of the source and
destination IP addresses of BFD packets. This patch adds the same
configuration option to the VTEP schema.
Signed-off-by: Bruce Davie <bsd@nicira.com> Acked-by: Alex Wang <alexw@nicira.com>
YAMAMOTO Takashi [Mon, 31 Mar 2014 05:04:35 +0000 (14:04 +0900)]
ofproto.at: Fix races in rule eviciton tests
Bump timeout differences, because timeouts different by 1s might end up
to have the same position in the heap as rule_eviction_priority() uses
1024ms as a unit.
Also, use time/stop to avoid relying on how long an add-flow would take.
These tests were introduced by commit 6d56c1f1.
("ofproto: Update rule's priority in eviction group.")
Acked-by: Ben Pfaff <blp@nicira.com> Acked-by: Kmindg G <kmindg@gmail.com> Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Andy Zhou [Sat, 29 Mar 2014 01:13:00 +0000 (18:13 -0700)]
unit-test: Add ovstest
Changing one of the files in the Open vSwitch ``lib'' directory
causes 43 binaries to be relinked, which takes a lot of time even with
parallel ``make''. 31 of those binaries are in the ``tests''
directory. ovs-test attemps to combine most of those binaries into a
single test program that just takes a subcommand name as its first
command-line argument.
The following patch makes use of this infrastructure.
Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Windows does not have a sleep(seconds). But it does have
a Sleep(milliseconds). Sleep() in windows does not have a
return value. Since we are not using the return value for xsleep()
anywhere as of now, don't return any.
Introduced by commit 275eebb9 (utils: Introduce xsleep for RCU quiescent state)
CC: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Jarno Rajahalme [Mon, 24 Mar 2014 16:17:01 +0000 (09:17 -0700)]
lib/ofpbuf: Compact
This patch shrinks the struct ofpbuf from 104 to 48 bytes on 64-bit
systems, or from 52 to 36 bytes on 32-bit systems (counting in the
'l7' removal from an earlier patch). This may help contribute to
cache efficiency, and will speed up initializing, copying and
manipulating ofpbufs. This is potentially important for the DPDK
datapath, but the rest of the code base may also see a little benefit.
Changes are:
- Remove 'l7' pointer (previous patch).
- Use offsets instead of layer pointers for l2_5, l3, and l4 using
'l2' as basis. Usually 'data' is the same as 'l2', but this is not
always the case (e.g., when parsing or constructing a packet), so it
can not be easily used as the offset basis. Also, packet parsing is
faster if we do not need to maintain the offsets each time we pull
data from the ofpbuf.
- Use uint32_t for 'allocated' and 'size', as 2^32 is enough even for
largest possible messages/packets.
- Use packed enum for 'source'.
- Rearrange to avoid unnecessary padding.
- Remove 'private_p', which was used only in two cases, both of which
had the invariant ('l2' == 'data'), so we can temporarily use 'l2'
as a private pointer.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Jarno Rajahalme [Fri, 28 Mar 2014 20:44:23 +0000 (13:44 -0700)]
datapath: Minimize dp and vport critical sections.
Move most memory allocations away from the ovs_mutex critical
sections. vport allocations still happen while the lock is taken, as
changing that would require major refactoring. Also, vports are
created very rarely so it should not matter.
Change ovs_dp_cmd_get() now only takes the rcu_read_lock(), rather
than ovs_lock(), as nothing need to be changed. This was done by
ovs_vport_cmd_get() already.
Jarno Rajahalme [Sat, 29 Mar 2014 22:52:32 +0000 (15:52 -0700)]
datapath: Make flow mask removal symmetric.
Masks are inserted when flows are inserted to the table, so it is
logical to correspondingly remove masks when flows are removed from
the table, in ovs_flow_table_remove().
This allows ovs_flow_free() to be called without locking, which will
be used by later patches.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>Summary:
Jarno Rajahalme [Sat, 29 Mar 2014 22:52:32 +0000 (15:52 -0700)]
datapath: Build flow cmd netlink reply only if needed.
Use netlink_has_listeners() and NLM_F_ECHO flag to determine if a
reply is needed or not for OVS_FLOW_CMD_NEW, OVS_FLOW_CMD_SET, or
OVS_FLOW_CMD_DEL. Currently, OVS userspace does not request a reply
for OVS_FLOW_CMD_NEW, but usually does for OVS_FLOW_CMD_DEL, as stats
may have changed.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Monam Agarwal [Fri, 28 Mar 2014 03:12:22 +0000 (20:12 -0700)]
datapath: Use with RCU_INIT_POINTER(x, NULL) in vport-gre.c
This patch replaces rcu_assign_pointer(x, NULL) with RCU_INIT_POINTER(x, NULL)
The rcu_assign_pointer() ensures that the initialization of a structure
is carried out before storing a pointer to that structure.
And in the case of the NULL pointer, there is no structure to initialize.
So, rcu_assign_pointer(p, NULL) can be safely converted to RCU_INIT_POINTER(p, NULL)
Pritesh Kothari [Fri, 28 Mar 2014 19:20:00 +0000 (12:20 -0700)]
sparse: workaround for a bug in sparse.
sparse emits the following warning:
lib/dpif-netdev.c:1755:15: warning: Initializer entry defined twice
lib/dpif-netdev.c:1755:15: also defined here
due to a bug in sparse which doesn't like inlined functions which
expands a #define within it. This commit removes inline to make
sparse happy.
Signed-off-by: Pritesh Kothari <pritesh.kothari@cisco.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Pritesh Kothari [Fri, 28 Mar 2014 19:19:59 +0000 (12:19 -0700)]
sparse: fix the order of include paths to make sparse happy.
This fix restores the order of include path such that the local include/ comes
before the system /usr/include in the #include path. Thus by making sure that
include/linux/types.h and include/linux/openvswitch.h take precedence over the
similar files in /usr/include/ directory.
Signed-off-by: Pritesh Kothari <pritesh.kothari@cisco.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Andy Zhou [Fri, 28 Mar 2014 02:08:36 +0000 (19:08 -0700)]
lib/hash.h: fix hash_2words
Number of bytes in 2 words should be 8, instead of 4 bytes,
to better follow the mhash_finish() API. It is unlikely the fix
will improve the quality of hashing results.
Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
For Windows, use a kernel assigned localhost TCP port to listen for
runtime management connections and then write it into a file
so that a client can read it and then make a TCP connection.
Since we do not have the infrastructure to create pidfiles on
windows as of now, we create the *.ctl file without a pid. This
should be okay since we use different OVS_RUNDIR when we run
multiple copies of a daemon.
We do not generate man pages on Windows. But we still update them
for Windows so that anyone can read it elsewhere. Since we do not
generate it directly, we cannot dynamically show the configured
OVS_RUNDIR in windows. So, I have a not so nice \fIOVS_RUNDIR\fR
in the man page.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
stream-tcp: Use closesocket instead of close for sockets.
We should use closesocket() while closing sockets so that
closing sockets work fine on both POSIX and Windows.
(In POSIX, we #define closesocket close)
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Alex Wang [Tue, 25 Mar 2014 20:57:25 +0000 (13:57 -0700)]
ofproto-dpif-upcall: Fix a race.
Commit 61057e884ca9c(ofproto-dpif-upcall: Slightly simplify
udpif_upcall_handler().) restructured the main loop in
udpif_upcall_handler() and discarded the check for the
'exit_latch' after acquiring the mutex. This makes it
possible for the following race:
- main thread sets the 'exit_latch' after the handler thread
checking it.
- main thread acquires the handler thread mutex and signals the
condition variable of handler thread.
- main thread releases the mutex and 'join' the handler thread.
- handler thread acquires the mutex, finds that n_upcalls is 0
and waits on the signal of condition variable.
- then OVS will hang forever.
This commit fixes the above issue by adding a check for the
'exit_latch' after acquiring the mutex.
Andy Zhou [Thu, 27 Mar 2014 16:10:31 +0000 (17:10 +0100)]
ovs-vsctl: Improve error reporting
ovs-vsctl is a command-line interface to the Open vSwitch database,
and as such it just modifies the desired Open vSwitch configuration in
the database. ovs-vswitchd, on the other hand, monitors the database
and implements the actual configuration specified in the database.
This can lead to surprises when the user makes a change to the
database, with ovs-vsctl, that ovs-vswitchd cannot actually
implement. In such a case, the ovs-vsctl command silently succeeds
(because the database was successfully updated) but its desired
effects don't actually take place. One good example of such a change
is attempting to add a port with a misspelled name (e.g. ``ovs-vsctl
add-port br0 fth0'', where fth0 should be eth0); another is creating
a bridge or a port whose name is longer than supported
(e.g. ``ovs-vsctl add-br'' with a 16-character bridge name on
Linux). It can take users a long time to realize the error, because it
requires looking through the ovs-vswitchd log.
The patch improves the situation by checking whether operations that
ovs executes succeed and report an error when
they do not. This patch only report add-br and add-port
operation errors by examining the `ofport' value that
ovs-vswitchd stores into the database record for the newly created
interface. Until ovs-vswitchd finishes implementing the new
configuration, this column is empty, and after it finishes it is
either -1 (on failure) or a positive number (on success).
Signed-off-by: Andy Zhou <azhou@nicira.com> Co-authored-by: Thomas Graf <tgraf@redhat.com> Signed-off-by: Thomas Graf <tgraf@redhat.com> Co-authored-by: Ben Pfaff <blp@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Fri, 28 Mar 2014 02:11:20 +0000 (19:11 -0700)]
ovsdb-idlc: Generate new *_get_for_uuid() functions.
This allows a client to obtain the IDL version of a row given its UUID.
It isn't normally useful, but there's a specialized use case for getting
the IDL version of a row given the UUID returned by
ovsdb_idl_txn_get_insert_uuid() following transaction commit.
An alternative would be to generate table-specific versions of
ovsdb_idl_txn_get_insert_uuid(). That seems reasonable to me too.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
Jarno Rajahalme [Tue, 25 Mar 2014 22:26:23 +0000 (15:26 -0700)]
lib/ofpbuf: Remove 'l7' pointer.
Now that we don't need to parse TCP flags from the packet after
extraction, we usually do not need the 'l7' pointer any more. When
needed, ofpbuf_get_tcp|udp|sctp|icmp_payload() or ofpbuf_get_l4_size()
can be used instead.
Removal of 'l7' was requested by Pravin for the DPDK datapath work, as
it simplifies packet parsing a bit.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Jarno Rajahalme [Tue, 25 Mar 2014 22:26:23 +0000 (15:26 -0700)]
lib/ofpbuf: Inline the trivial ofpbuf functions.
Inline the most trivial ofpbuf functions to allow for better optimization.
Also inline the most often used ofpbuf_pull() and ofpbuf_try_pull(), which
should help streamline packet parsing.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Andy Zhou [Tue, 4 Mar 2014 23:36:03 +0000 (15:36 -0800)]
dpif-netdev: user space datapath recirculation
Add basic recirculation infrastructure and user space
data path support for it. The following bond mega flow patch will
make use of this infrastructure.
Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Andy Zhou [Mon, 27 Jan 2014 09:18:30 +0000 (01:18 -0800)]
ofproto-dpif: Added Per backer recirculation ID management
Recirculation ID needs to be unique per datapath. Its usage will be
tracked by the backer that corresponds to the datapath.
In theory, Recirculation ID can be any uint32_t value, except 0. This
implementation limits to a smaller range just for ease of debugging.
Make the range size 0 effectively disables recirculation.
Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
YAMAMOTO Takashi [Tue, 25 Mar 2014 20:13:43 +0000 (05:13 +0900)]
netdev-bsd: compilation fixes
This fixes regressions from commit f7791740
("netdev: Rename netdev_rx to netdev_rxq")
and commit df1e5a3b.
("netdev: Extend rx_recv to pass multiple packets.")
Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Signed-off-by: Ben Pfaff <blp@nicira.com>
Pravin [Tue, 25 Mar 2014 02:23:08 +0000 (19:23 -0700)]
dpif-netdev: Add DPDK netdev.
Following patch adds DPDK netdev-class to userspace datapath. Now
OVS can use DPDK port for IO by just configuring DPDK port and then
adding dpdk type port to userspace datapath.
Refer to INSTALL.DPDK doc for further info.
This is based a patch from Gerald Rogers.
Signed-off-by: Gerald Rogers <gerald.rogers@intel.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Thomas Graf <tgraf@redhat.com>
Pravin [Thu, 20 Mar 2014 17:57:41 +0000 (10:57 -0700)]
dpif-netdev: Add poll-mode-device thread.
This patch adds PMD type netdev for netdevice with poll-mode
drivers. Since there is no way to get signal on a packet recv
from these devices we need to poll them in busy loop. So minimize
system call overhead this patch uses dpif-thread exclusively
for PMD devices and rest of devices which needs system calls to
do IO are moved to dpif-netdev-run().
PMD device like DPDK work in userspace so there is no system call
overhead for them.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Thomas Graf <tgraf@redhat.com>
Pravin [Thu, 20 Mar 2014 17:54:37 +0000 (10:54 -0700)]
netdev: Extend rx_recv to pass multiple packets.
DPDK can receive multiple packets but current netdev API does
not allow that. Following patch allows dpif-netdev receive batch
of packet in a rx_recv() call for any netdev port. This will be
used by dpdk-netdev.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Jarno Rajahalme [Tue, 25 Mar 2014 00:34:48 +0000 (17:34 -0700)]
datapath: Avoid assigning a NULL pointer to flow actions.
Flow SET can accept an empty set of actions, with the intended
semantics of leaving existing actions unmodified. This seems to have
been brokin after OVS 1.7, as we have assigned the flow's actions
pointer to NULL in this case, but we never check for the NULL pointer
later on. This patch restores the intended behavior and documents it
in the include/linux/openvswitch.h.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>