Ben Pfaff [Wed, 18 Apr 2012 00:07:00 +0000 (17:07 -0700)]
ovs-vsctl: Merge struct vsctl_info into struct vsctl_context.
To speed up management operations with many ports, we need to preserve the
cache of bridges, ports, and interfaces from one operation to the next.
One necessary step is to push the "struct vsctl_info" that did the caching
up from the individual functions that need it into a more global structure.
This commit does that, merging it into struct vsctl_context.
This commit also modifies do_vsctl(), the top-level control code in
ovs-vsctl, to keep this part of the vsctl_context unchanged from running
one command to the next.
Ben Pfaff [Tue, 17 Apr 2012 20:50:53 +0000 (13:50 -0700)]
ovs-vsctl: Verify VLAN bridge controllers in cmd_get_controller().
A VLAN bridge uses its parent's controllers, so checking the controller
should verify the parent's set of controllers.
The change to verify_controllers() isn't necessary; it just deletes
the check for a null 'bridge' because verify_controllers() can no
longer be called with a null 'bridge'.
This fixes a bug, but it is unlikely to ever have caused a real problem for
users.
Ben Pfaff [Mon, 16 Apr 2012 22:54:37 +0000 (15:54 -0700)]
ofproto-dpif: Avoid extra flow copy in xlate_actions() for unneeded warnings.
The copy of the extra flow copy here was showing up in profiles. We only
need this copy if we end up doing a "trace" to warn the user. Most runs
won't ever do that, so don't start making copies until we actually hit
such a case.
This has a small behavioral change in that we'll only get a warning on the
*second* time we hit the resubmit recursion limit, not on the first. I
doubt that's really a problem.
Ben Pfaff [Thu, 19 Apr 2012 00:11:10 +0000 (17:11 -0700)]
ofproto-dpif: Implement "flow setup governor" to speed up many short flows.
The cost of creating and initializing facets and subfacets and installing,
tracking, and uninstalling kernel flows is significant. When most flows
have only one or a few packets, this overhead is higher than the cost of
handling each packet individually. This commit introduces heuristics that
cheaply count (approximately) the number of packets seen in a flow and
skips most of this expensive bookkeeping until the packet count exceeds a
threshold (currently 5 packets).
Ben Pfaff [Fri, 13 Apr 2012 21:09:10 +0000 (14:09 -0700)]
ofproto-dpif: Make it easier to credit statistics for resubmits.
Until now, crediting statistics to OpenFlow rules due to "resubmit" actions
has required setting up a "resubmit hook" with a callback function and
auxiliary data. This commit makes it easier to do, by adding a member to
struct action_xlate_ctx that specifies statistics to credit to each
resubmitted rule.
This commit includes one small behavioral change as an optimization.
Previously, rule_execute() translated the rule twice: once to get the ODP
actions, then a second time after executing the ODP actions to credit
statistics to the rules. After this commit, rule_execute() translates the
rule only once, crediting statistics as a side effect. The difference only
becomes visible when executing the actions fails: previously the statistics
would not be incremented, after this commit they will be. It is very
unusual for executing actions to fail (generally this indicates a bug) so
I'm not concerned about it.
Ben Pfaff [Mon, 9 Apr 2012 22:49:22 +0000 (15:49 -0700)]
classifier: Optimize search of "catchall" table.
Most flow tables have some kind of "catchall" rule that matches every
packet. For this table, the cost of copying, zeroing, and hashing the
input flow is significant. This patch avoids these costs.
Ben Pfaff [Sat, 7 Apr 2012 00:11:18 +0000 (17:11 -0700)]
ofproto-dpif: Avoid malloc() of "struct flow_miss".
In addition to avoid malloc() for struct flow_miss, this commit avoids
copying "struct flow" around, which is a significant benefit because
struct flow is currently 144 bytes.
Ben Pfaff [Fri, 6 Apr 2012 22:30:21 +0000 (15:30 -0700)]
ofproto-dpif: Avoid malloc() in common case for allocating subfacets.
Usually a facet has exactly one subfacet that has the same lifetime as
the facet. Allocating both the facet and its subfacet in a single memory
block improves performance.
Ben Pfaff [Mon, 16 Apr 2012 23:01:01 +0000 (16:01 -0700)]
netlink: Postpone choosing sequence numbers until send time.
Choosing sequence numbers at time of creating a packet means that
nl_sock_transact_multiple() has to search for the sequence number
of a reply, because the sequence numbers of the requests aren't
necessarily sequential. This commit makes it possible to avoid
the search, by deferring choice of sequence numbers until the
time that we send the packets. It doesn't actually modify
nl_sock_transact_multiple(), which will happen in a later commit.
Previously, I was concerned about a theoretical race condition
described in a comment in the old versino of this code:
This implementation uses sequence numbers that are unique
process-wide, to avoid a hypothetical race: send request, close
socket, open new socket that reuses the old socket's PID value,
send request on new socket, receive reply from kernel to old
socket but with same PID and sequence number. (This race could be
avoided other ways, e.g. by preventing PIDs from being quickly
reused).
However, I no longer believe that this can be a real problem,
because Netlink operates synchronously. The reply to a request
will always arrive before the socket can be closed and a new
socket opened with the old socket's PID.
Ben Pfaff [Thu, 5 Apr 2012 20:43:24 +0000 (13:43 -0700)]
ofproto-dpif: Don't do an extra flow translation when removing facets.
When we remove a facet, it seems superfluous to translate the flow
once more just to update the MAC learning table, since we know that
it was done within a second or so anyway (e.g. when the flow stats
were last updated).
Ansis Atteka [Fri, 30 Mar 2012 02:03:08 +0000 (19:03 -0700)]
ovs-test: Enhancements to the ovs-test tool
-Implemented support for ovs-test client, so that it could automatically
spawn an ovs-test server process from itself. This reduces the number of
commands the user have to type to get tests running.
-Automated creation of OVS bridges and ports (for VLAN and GRE tests), so that
user would not need to invoke ovs-vsctl manually to switch from direct, 802.1Q
and GRE tests.
-Fixed some pylint reported warnings.
-Fixed ethtool invocation so that we always try to query the physical interface
to get the driver name and version.
-and some others enhancements.
The new usage:
Node1:ovs-test -s 15531
Node2:ovs-test -c 127.0.0.1,1.1.1.1 192.168.122.151,1.1.1.2 -d -l 125 -t gre
Ethan Jackson [Mon, 16 Apr 2012 22:01:09 +0000 (15:01 -0700)]
lacp: Remove heartbeat mode.
The LACP heartbeat mode was used to monitor interfaces for
connectivity. It turns out that LACP is inferior to CFM for this
purpose. For the sake of simplicity this patch removes the
feature.
Ethan Jackson [Mon, 16 Apr 2012 21:55:58 +0000 (14:55 -0700)]
lacp: Remove custom transmission intervals.
Open vSwitch allowed users to set a custom LACP PDU transmission
interval. This turned out to be an ill conceived idea which was
more confusing than useful. This patch reverts Open vSwitch to the
behavior supported in the LACP specification: two transmission
intervals, fast and slow.
Ethan Jackson [Mon, 16 Apr 2012 19:09:49 +0000 (12:09 -0700)]
vswitch: Use consistent representation of DSCP bits.
There are two sensible ways to represent the 6 DSCP bits of an IP
packet. One could represent them as an integer in the range 0 to
63. Or one could represent them as they would appear in the tos
field (0 to 63) << 2. Before this patch, OVS had used the former
method for the DSCP bits in the Queue Table, and the latter for the
DSCP in the Controller and Manager tables. Since the ability to
set DSCP bits in the Controller and Manager tables is so new that
it hasn't been released yet, this patch changes it to use the
existing style employed in the Queue table. Hopefully this should
make the code and configuration less confusing.
Ethan Jackson [Mon, 16 Apr 2012 20:56:58 +0000 (13:56 -0700)]
socket-util: Remove DSCP_INVALID.
The DSCP_INVALID flag allowed callers to prevent socket-util from
modify the DSCP bits of newly created sockets. However, the two
really important callers (implementations of the controller and
manager tables) never used it. Furthermore, the other callers
would be fine always setting the DSCP bits to zero. This patch
removes the DSCP_INVALID option in an effort to simplify the code.
Ethan Jackson [Thu, 12 Apr 2012 02:52:46 +0000 (19:52 -0700)]
lib: Pull dscp bits out of reconnect.
The DSCP bits of a connection have nothing to do with the
reconnection state machine. This pulls them up into jsonrpc which
needs them to properly establish connections.
Ethan Jackson [Mon, 16 Apr 2012 19:46:46 +0000 (12:46 -0700)]
socket-util: Close socket on failed dscp modification.
If socket-util failed to modify the dscp bits of an active
connection, it would fail to close the file descriptor potentially
causing a leak. Found by inspection.
Ben Pfaff [Sat, 14 Apr 2012 04:24:17 +0000 (21:24 -0700)]
learn: Make it possible to parse "load" actions wider than 64 bits.
The implementation of the "learn" action now properly implements
specifications such as 0x20010db885a308d313198a2e03707348->NXM_NX_IPV6_DST
but the parser used in ovs-ofctl and elsewhere could not generate such
specifications. This commit adds that support.
Ben Pfaff [Wed, 11 Apr 2012 21:45:34 +0000 (14:45 -0700)]
meta-flow: New functions for reading and writing generalized subfields.
The existing functions for reading and writing the values of subfields only
handle subfields up to 64 bits wide. These new functions handle subfields
of any width.
Ethan Jackson [Sat, 7 Apr 2012 01:38:48 +0000 (18:38 -0700)]
bridge: Rate limit port creations and deletions.
In some datapaths, adding or deleting OpenFlow ports can take quite
a bit of time. If there are lots of OpenFlow ports which needed to
be added in a run loop, this can cause Open vSwitch to lock up and
stop setting up flows while trying to catch up. This patch lessons
the severity of the problem by only doing a few OpenFlow port adds
or deletions per run loop allowing other work to be done in
between.
Bug #10672. Signed-off-by: Ethan Jackson <ethan@nicira.com>
Ben Pfaff [Tue, 27 Mar 2012 17:16:52 +0000 (10:16 -0700)]
ovsdb-idl: Simplify transaction retry.
Originally the IDL transaction state machine had a return value
TXN_TRY_AGAIN to signal the client to wait for a change in the database and
then retry its transaction. However, this logic was incomplete, because
it was possible for the database to change before the reply to the
transaction RPC was received, in which case the client would wait for a
further change. Commit 4fdfe5ccf84c (ovsdb-idl: Prevent occasional hang
when multiple database clients race.) fixed the problem by breaking
TXN_TRY_AGAIN into two status codes, TXN_AGAIN_WAIT that meant to wait for
a further change and TXN_AGAIN_NOW that meant that a change had already
occurred so try again immediately.
This is correct enough, but it is more complicated than necessary. It is
simpler and just as correct to use a single "try again" status that
requires the client to wait for a change relative to the database contents
*before* the transaction was committed. This commit makes that change.
It also changes ovsdb_idl_run()'s return type from bool to void because
its return type is hardly useful anymore.
Python binaries to write comment in db show-log while updating database
This is an improvement in python script ovs-xapi-sync to call add_comment()
method while updating database. This will help developer to debug binaries
and database from ovsdb-tool show-log command output and understand which
python binary is updating db.
Feature #10543 Signed-off-by: Arun Sharma <arun.sharma@calsoftinc.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
vswitchd: Remove port from datapath if it becomes non-operational
If kernel module rejects config changes then vswitchd sets the ofport
column to -1, but does not remove the non-operational port from the
datapath. This patch fixes this problem.
ovs-vsctl add-br ovsbr
ovs-vsctl add-port ovsbr p1
ovs-vsctl add-port ovsbr p2
ovs-vsctl set Interface p1 options:remote_ip=2.1.1.1 options:key=123 type=gre
ovs-vsctl set Interface p2 options:remote_ip=1.1.1.1 options:key=123 type=gre
ovs-vsctl set Interface p2 options:remote_ip=2.1.1.1 options:key=123 type=gre
ovs-dpctl show #observe that p2 does not appear here anymore
Ethan Jackson [Fri, 6 Apr 2012 18:47:51 +0000 (11:47 -0700)]
util: New function set_program_name_version().
With this function, users of the Open vSwitch libraries which
should not have the same version as Open vSwitch will have their
version displayed in unixctl and at the command line.
The CFM packets that are out of sequence or contain invalid cfm_interval were
previously not ignored. The behavior is changed with this patch to not
process those CFM frames.
The changes display the cfm_health of an interface. The cfm_health
is an exponential weighted moving average of the health of all
remote_mpids. The value can vary from 0 to 100, 100 being very healthy
and 0 being unhealthy.
Feature #10363 Requested-by: Ethan Jackson <ethan@nicira.com> Signed-off-by: Mehak Mahajan <mmahajan@nicira.com>
Mehak Mahajan [Thu, 29 Mar 2012 21:34:51 +0000 (14:34 -0700)]
Granular link health statistics for cfm.
The changes display the cfm_health of an interface. The cfm_health
is an exponential weighted moving average of the health of all
remote_mpids. The value can vary from 0 to 100, 100 being very healthy
and 0 being unhealthy.
Feature #10363 Requested-by: Ethan Jackson <ethan@nicira.com> Signed-off-by: Mehak Mahajan <mmahajan@nicira.com>
bugtool - Collect version information for all running Open vSwitch daemons.
This is an improvement in {ovs|xen}-bugtool archive output to determine the
version which was running for all the OVS daemons. It is implemented as a
plugin which internally calls "ovs-appctl -t <daemon> version" command for
daemons whose pid file is present in /var/run/openvswitch directory.
David S. Miller [Mon, 2 Apr 2012 20:25:21 +0000 (13:25 -0700)]
datapath: Stop using NLA_PUT*().
These macros contain a hidden goto, and are thus extremely error
prone and make code hard to audit.
Signed-off-by: David S. Miller <davem@davemloft.net>
[jesse: Additional transformations for code not upstream.] Signed-off-by: Jesse Gross <jesse@nicira.com>
Linux 3.5 replaces the NLA_PUT_* functions with non-macro version,
which required adding the endian-typed versions. This backports
those functions and drops the macro backports.
Ethan Jackson [Sat, 31 Mar 2012 06:11:05 +0000 (23:11 -0700)]
unixctl.py: Allow callers to manually set unixctl version.
Some clients of unixctl.py may want to use a different version than
the one supplied in ovs.version.VERSION. This patch supports an
optional manual override of this value.
Feature #10383. Signed-off-by: Ethan Jackson <ethan@nicira.com>
Ben Pfaff [Mon, 2 Apr 2012 16:06:11 +0000 (09:06 -0700)]
check-structs: Disallow uint<N>_t because ovs_be<N> should always be used.
The header files that check-structs checks should only contain big-endian
data, never native-endian data, so disallow uint<N>_t entirely. (We had
a couple of mistakes in this area until recently.)
uint8_t is an obvious exception.
Reported-by: Simon Horman <horms@verge.net.au> Reviewed-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Pfaff <blp@nicira.com>
Simon Horman [Fri, 30 Mar 2012 02:30:01 +0000 (11:30 +0900)]
Definitions for Open Flow 1.2
This is a first pass at adding include/openflow/openflow-1.2.h to
include enum and struct definitions for Open Flow 1.2 that
are not already covered by Open Flow 1.1.
Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Pfaff <blp@nicira.com>
Simon Horman [Fri, 30 Mar 2012 02:30:00 +0000 (11:30 +0900)]
Add some missing Open Flow 1.1 definitions
Signed-off-by: Simon Horman <horms@verge.net.au>
[blp@nicira.com added OFPRR_GROUP_DELETE to
ofp_flow_removed_reason_to_string()] Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Wed, 28 Mar 2012 21:13:02 +0000 (14:13 -0700)]
Rearrange structures to better fit valgrind's memory leak heuristics.
valgrind's memory leak detector considers a pointer to the head of a memory
block to be "definitely" a pointer to that memory block but a pointer to
the interior of a memory block only "possibly" a pointer to that memory
block. Open vSwitch hmap_node and list data structures can go anywhere
inside a structure; if they are in the middle of a structure then valgrind
considers pointers to them to be possible leaks. Therefore, this commit
moves some of these from the middle of data structures to the head, to
reduce valgrind's uncertainty.
Ben Pfaff [Tue, 27 Mar 2012 22:57:52 +0000 (15:57 -0700)]
Avoid possibly including an old vswitch-idl.h.
Codes that uses #include "vswitch-idl.h" can get an older version of this
header, because this header file moved from vswitchd/ to lib/ and the
older generated file might still be present.
This helps out two ways:
* "make clean" will delete the generated files from their old
locations.
* Use #include "lib/vswitch-idl.h" to explicitly avoid including the
files from their old locations.
Reported-by: Justin Pettit <jpettit@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Tue, 27 Mar 2012 17:20:56 +0000 (10:20 -0700)]
dpif-netdev: Correct type of struct dp_netdev_flow's 'tcp_flags' member.
TCP flags fit in 8 bits so this type seems more appropriate.
Fixes the following "sparse" warnings introduced by commit 734ec5ec1349
(packet: Add additional TCP flags extraction on IPv6.):
dpif-netdev.c:630: warning: incorrect type in assignment (different base types)
dpif-netdev.c:630: expected unsigned char [unsigned] [usertype] tcp_flags
dpif-netdev.c:630: got restricted __be16 [usertype] tcp_flags
dpif-netdev.c:979: warning: invalid assignment: |=
dpif-netdev.c:979: left side has type restricted __be16
dpif-netdev.c:979: right side has type unsigned char
CC: Jesse Gross <jesse@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Simon Horman [Mon, 26 Mar 2012 20:46:35 +0000 (13:46 -0700)]
Add error codes for Open Flow v1.2
* Where Open Flow 1.2 breaks apart error codes defined
in previous versions, provide all new definitions to
previous versions and map the numeric error code to
the first first definition supplied in ofp-errors.h.
The case handled so far is:
OFPERR_OFPBIC_BAD_EXP_TYPE -> { OFPERR_OFPBIC_BAD_EXPERIMENTER,
OFPERR_OFPBIC_BAD_EXP_TYPE }
* Where Open Flow 1.2 adds error codes that were previously
defined as Nicira extension errors define the later in terms
of the new codes.
Signed-off-by: Simon Horman <horms@verge.net.au>
[blp@nicira.com added better error checking in extract-ofp-errors, added
unit tests, miscellaneous cleanup] Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Simon Horman <horms@verge.net.au>
Ben Pfaff [Mon, 26 Mar 2012 18:25:52 +0000 (11:25 -0700)]
ofp-errors: Use OF1.1+ in place of OF1.1 throughout.
In general, I guess that the common case is for most error codes to be
retained without change in future versions of OpenFlow, so to me it seems
best to use the "+" version "by default".
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Simon Horman <horms@verge.net.au>
Jesse Gross [Mon, 26 Mar 2012 19:56:14 +0000 (12:56 -0700)]
packet: Add additional TCP flags extraction on IPv6.
Commit 11460e2316b88f0bd0ea0005d94338d800ea16bd
(flow: Enable retrieval of TCP flags from IPv6 traffic.) updated
one of the TCP flags extraction functions in userspace but missed
the other. This updates that function and converts the other to
use it to reduce duplication.
Raju Subramanian [Fri, 23 Mar 2012 22:49:32 +0000 (15:49 -0700)]
ovs-bugtool: Add ability to prioritize files by date.
When size limit is reached in the middle of processing a dir,
the report ends up containing oldest files. This change adds
an optional param in the plugin to prioritize newer files.
Feature #9937 Requested-by: Ronald Lee <rlee@nicira.com> Signed-off-by: Raju Subramanian <rsubramanian@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Mehak Mahajan [Sat, 10 Mar 2012 23:58:10 +0000 (15:58 -0800)]
Allow configuring DSCP on controller and manager connections.
The changes allow the user to specify a separate dscp value for the
controller connection and the manager connection. The value will take
effect on resetting the connections. If no value is specified a default
value of 192 is chosen for each of the connections.
Jesse Gross [Fri, 23 Mar 2012 20:14:51 +0000 (13:14 -0700)]
flow: Add length check when retrieving TCP flags.
When collecting TCP flags we check that the IP header indicates that
a TCP header is present but not that the packet is actually long
enough to contain the header. This adds a check to prevent reading
off the end of the packet.
In practice, this is only likely to result in reading of bad data and
not a crash due to the presence of struct skb_shared_info at the end
of the packet.
Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Pravin B Shelar [Fri, 23 Mar 2012 21:14:24 +0000 (14:14 -0700)]
vswitchd: Do not refresh existing iface on new device addition.
There is no need to refresh status and stats for existing devices
if iface mtu is missing in ovs-db as missing MTU could just mean
error in last MTU read for that device.
So we can refresh stats for devices which are just created.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Ben Pfaff [Fri, 23 Mar 2012 18:43:54 +0000 (11:43 -0700)]
treewide: Convert tabs to spaces in C source files written in OVS style.
The Open vSwitch C style doesn't use hard tabs.
This commit doesn't touch files written in kernel style or that are
imported from other projects where we want to minimize changes from
upstream (the sflow files).
Reported-by: Mehak Mahajan <mmahajan@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Thu, 22 Mar 2012 20:24:23 +0000 (13:24 -0700)]
ovsdb-idlc: Fix memory leak in "optional bool" columns.
Commit 1bf2c9096858 (idl: Generalize special case boolean exception.)
changed the IDL to do dynamic allocation with (x)malloc() for optional
booleans, but it didn't add the corresponding calls to free(). This
commit fixes the problem.
Bug #10357. Reported-by: Paul Ingram <paul@nicira.com> Reported-by: Krishna Miriyala <krishna@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>