Ethan Jackson [Tue, 21 Aug 2012 22:44:58 +0000 (15:44 -0700)]
bridge: Deprecate the null interface type.
It's not entirely clear what problem the null interface type is
trying to solve, nor how it could be of use to any controller.
This patch deprecates it, and schedules its removal for February
2013. If there are concerns, please email dev@openvswitch.org.
Ethan Jackson [Sat, 18 Aug 2012 02:14:02 +0000 (19:14 -0700)]
nicira-ext: Deprecate the autopath action.
The autopath action is an obsolete attempt to replicate
functionality contained in the bundle action. It is ugly and of
questionable usefulness. This patch deprecates it and schedules
its removal for February 2013. If there are concerns, please email
dev@openvswitch.org.
Ethan Jackson [Sat, 18 Aug 2012 01:47:43 +0000 (18:47 -0700)]
bond: Deprecate stable bonds.
The stable bond mode is an obsolete attempt to replicate the
functionality contained in the bundle action. They are ugly and of
questionable usefulness. This patch deprecates them and schedules
their removal for February 2013. If there are concerns, please
email dev@openvswitch.org.
Jesse Gross [Wed, 22 Aug 2012 00:48:54 +0000 (17:48 -0700)]
datapath: Fix namespace refcount leak on failed init.
If a datapath fails to initialze fully (likely due to out-of-memory)
then it's possible that we can take a reference to a network
namespace but never release it. This fixes the problem by releasing
any resources in the event of an error.
Found by code inspection, it's likely to be extremely rare in practice.
Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Ben Pfaff [Sat, 18 Aug 2012 06:27:40 +0000 (23:27 -0700)]
ofproto-dpif: Make sure one-packet flows have zero duration (again).
commit 6a0a5bbbc (ofproto-dpif: Make sure one-packet flows have zero
duration.) was supposed to fix failures in the "ofproto-dpif - NetFlow
flow expiration" test, but it didn't fix the whole problem. That commit
eliminated one reason why a one-packet flow might be shown as having an
nonzero duration, but missed another.
The other reason was that the call to dpif_flow_stats_extract() could
obtain a time later than the time that a new facet was created. (This
wasn't obvious because dpif_flow_stats_extract() obtained the time
internally instead of taking it from the caller.) This commit fixes that
problem, by using the facet creation there too for the first packet in
a facet.
This problem has suddenly started showing up in a lot of builds. I think
it's probably because of the recent change that makes x86-64 skip the timer
optimizations, so that the return value of time_msec() changes every 1 ms,
not just every 100 ms.
I've tested this by running the test in question in a loop for several
minutes, without any failures.
Ben Pfaff [Tue, 21 Aug 2012 20:51:01 +0000 (13:51 -0700)]
bond: Tag flows according to their hash bucket, not just their slave.
The bonding code is supposed to tag flows two ways:
- According to the chosen bond slave, to make it easy to invalidate all
of the flows assigned to a given slave.
- According to the hash value for a flow, to make it easy to invalidate
all of the flows that hash into the same bucket.
However, the code wasn't actually applying the hash-based tags. This
meant that rebalancing didn't take effect immediately, and so after
rebalancing we could get log messages like this:
inconsistency in subfacet (actions were: 5) (correct actions: 4)
specifying some flow that was moved by the rebalance.
This commit fixes the problem by applying the hash-based tags.
Bug #12847. Reported-by: Pratap Reddy <preddy@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Ben Pfaff [Mon, 20 Aug 2012 17:29:39 +0000 (10:29 -0700)]
tests: New m4 macro ON_EXIT to add a cleanup action.
Several of the tests start daemons and then need to make sure that the
daemons get killed when the test completes, even if it completes in the
middle due to an early failure. Until now, they have been using manual
shell "trap" calls to do this. This works well enough for simple cases,
but sometimes multiple macros start daemons in a single test, and then
each "trap" has to be carefully written to kill off the daemons for the
previously invoked macros.
This commit introduces a new macro ON_EXIT whose use is composable: each
call appends a new action to the ones already specified.
Ben Pfaff [Wed, 15 Aug 2012 17:16:49 +0000 (10:16 -0700)]
Do not include zeroed metadata fields in NXM/OXM packet-in messages.
NXM and OpenFlow 1.2+ allow including the values of arbitrary flow metadata
in "packet-in" messages. Open vSwitch has until now always included all
the values of the metadata fields that it implements in NXT_PACKET_IN
messages.
However, this has at least two disadvantages:
- Most of the metadata fields tend to be zero most of the time, which
wastes space in the message.
- It means that controllers must be very liberal about accepting
fields that they know nothing about in packet-in messages, since any
switch upgrade could cause new fields to appear even if the
controller does nothing to give them nonzero values. (Controllers
have to be prepared to tolerate unknown fields in any case, but this
property makes unknown fields more likely to appear than otherwise.)
This commit changes Open vSwitch so that metadata fields whose values are
zero are not reported in packet-ins, fixing both problems. (This is
explicitly allowed by OpenFlow 1.2+.)
This commit mainly fixes a sort of internal conceptual dissonance centering
around struct flow_metadata. This structure is supposed to report the
metadata for a given flow. If you look at a flow, it has particular
metadata values; it doesn't have masks, and the idea of a mask for a
particular flow doesn't really make sense. However, struct flow_metadata
did have masks. This led to internal confusion; one can see this in, for
example, the following code removed by this commit in ofproto-dpif.c to
handle misses in the OpenFlow flow table:
/* Registers aren't meaningful on a miss. */
memset(pin.fmd.reg_masks, 0, sizeof pin.fmd.reg_masks);
What this code was really trying to say is that on a flow miss, the
registers are zero, so they shouldn't be included in the packet-in message.
It did manage to omit the registers, by marking them as "wild", but it is
conceptually more correct to simply omit them because they are zero (and
that's one effect of this commit).
Bug #12968. Reported-by: Igor Ganichev <iganichev@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Tue, 14 Aug 2012 20:18:03 +0000 (13:18 -0700)]
tests: Invoke daemons with --no-chdir so core files appear in test dir.
The OVS daemons "cd" to / as a normal part of their startup, since this is
traditional for daemons under Unix. But this also means that, if the
daemons happen to terminate with a core in the unit tests, then the core
file won't be written because / has too-restrictive permissions. (Unless
you run the unit tests as root, or you've got cores configured to go to a
non-standard location.)
This commit fixes the problem by invoking most daemons with --no-chdir so
that the core files go to a test-specific directory. I didn't change
invocations of the Python daemons, since Python doesn't normally terminate
with a core.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Kyle Mestery <kmestery@cisco.com>
Ben Pfaff [Tue, 14 Aug 2012 20:23:22 +0000 (13:23 -0700)]
vswitch.xml: active-backup and balance-slb bonds must be one-sided.
It doesn't work to hook up an active-backup bond on one switch to an
active-backup bond on another switch, because they might pick different
active interfaces and therefore not pass any traffic.
The same is true of balance-slb because multicast and broadcast
traffic is dropped on ingress to any interface other than the active
interface.
Ben Pfaff [Thu, 16 Aug 2012 18:33:21 +0000 (11:33 -0700)]
ofproto-dpif: Avoid dereferencing possibly null or wild pointer.
If ofpacts_len is 0 then ofpacts->type is a bad reference.
(An early draft of ofpacts used an OFPACT_END sentinel so that there was
always data there in this function, but in review the sentinel got deleted
and I did not notice that this function needed an update.)
Found by valgrind.
Bug #12847. Signed-off-by: Ben Pfaff <blp@nicira.com>
Currently, if a controller having a nonzero id registers to get a
OFPR_INVALID_TTL async message, it will not receive it. This is because
compose_dec_ttl() only sent the invalid ttl packets to the default controller
id. NXAST_DEC_TTL_CNT_IDS is a new action that accepts a list of controller
ids, each separated by `,', to which the OFPR_INVALID_TTL packets must be sent.
The earlier requirement of the controller having to explicitly register to
receive these asynchronous messages is retained.
The syntax of this action is:
dec_ttl(id1,id2)
where id1, id2 are valid controller ids.
Ben Pfaff [Mon, 13 Aug 2012 16:30:26 +0000 (09:30 -0700)]
ofproto-dpif: Avoid searching all subfacets when creating first in a facet.
When we create the first subfacet within a facet, we know that there
cannot be an existing subfacet with the same key, so we can skip the search
through the ofproto's table of subfacets.
This is a small optimization, but it should not affect the flow setup rate
in most benchmarks, because in the stressful situations that benchmarks
create, OVS does not set up flows.
Ethan Jackson [Tue, 14 Aug 2012 21:08:09 +0000 (14:08 -0700)]
bridge: Write certain statistics to the database instantly.
Traditionally the bridge has written interface and port statistics
to the database in a rate limited fashion. This makes a lot of
sense for statistics which are either constantly changing, or are
expensive to collect. However, some statistics were rate limited
which have neither of these properties. Furthermore some of these
statistics (most notably carrier) could be very useful to a
controller if updated promptly.
Mehak Mahajan [Wed, 15 Aug 2012 18:19:35 +0000 (11:19 -0700)]
Correct number of bytes to allocated for slaves in bundle action.
The size of each slave is a uint16_t. This means that each slave needs 2 bytes
at the end of nx_action_bundle. Earlier, the size of each slave was not being
factored in when allocating space. This commit corrects that by allocating 2
bytes for each slave when calculating the total number of bytes to be allocated
at the end of nx_action_bundle.
Jesse Gross [Tue, 7 Aug 2012 22:26:33 +0000 (15:26 -0700)]
tests: Add delay in NetFlow unit tests before killing processes.
At the end of the NetFlow unit tests we warp time to force any
remaining flows to expire and then immediately kill OVS and the
collector. However, this creates a race where sometimes these
processes are killed before the last records are sent or collected.
It's possible to force OVS to go through the run loop one last time
before exiting but it's harder to enforce that the collector receives
the packet. This simply avoids the problem by adding a 1 second delay
before killing the processes, which should be more than enough time.
Ethan Jackson [Fri, 10 Aug 2012 23:36:18 +0000 (16:36 -0700)]
cfm: Report opup as undefined if not in extended mode.
The cfm_get_opup() function's result doesn't make sense when CFM is
not configured in extended mode. This patch makes it report -1 in
this case. Future patches will rely on this behavior.
Simon Horman [Thu, 9 Aug 2012 08:49:32 +0000 (17:49 +0900)]
ofp-msgs: Allow 1.0-1.2 range
This is intended for use with OFPRAW_OFPST_TABLE_REQUEST
in order for it to be symmetric with OpenFlow 1.0, 1.1 and 1.2
versions of OFPRAW_OFPST1TABLE_REPLY.
OpenFlow 1.3 introduces yet another format for OFPRAW_OFPST1TABLE_REPLY.
Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Pfaff <blp@nicira.com>
Simon Horman [Thu, 9 Aug 2012 08:49:23 +0000 (17:49 +0900)]
ofp-util: Use 32bit port for Open Flow 1.1 & 1.2 Port Mod Request
When encoding Open Flow 1.1 & 1.2 Port Mod Request messages
the port number should be converted to 32bits using
ofputil_port_to_ofp11() rather than htonl(). This ensures
that port numbers in the reserved range are translated correctly.
Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Pfaff <blp@nicira.com>
Leo Alterman [Tue, 7 Aug 2012 23:36:27 +0000 (16:36 -0700)]
timeval: On Linux x86-64 systems refresh time whenever it is requested.
64-bit Linux appears to avoid syscalls for clock_gettime(), so we can get
higher resolution timing and avoid having a timer firing off SIGALRM
without introducing extra overhead.
Signed-off-by: Leo Alterman <lalterman@nicira.com>
Leo Alterman [Thu, 9 Aug 2012 00:40:43 +0000 (17:40 -0700)]
lockfile: Remove lockfile_lock timeout argument
lockfile_lock() accepts a timeout argument but, aside from unit tests
pertaining to timeout, its value is always 0. Since this feature relies on
a periodic SIGALRM signal, which is not a given if we're not caching time,
the cleanest solution is just to remove it.
Signed-off-by: Leo Alterman <lalterman@nicira.com>
If we've already reported an error at this point, then we currently report
a no-match error also, but that doesn't add any useful information; it's
just noise in the log.
Ben Pfaff [Thu, 9 Aug 2012 21:18:46 +0000 (14:18 -0700)]
stream-ssl: Seed OpenSSL if it fails to seed itself.
We occasionally see OpenSSL fail to seed its random number generator in
heavily loaded hypervisors. I suspect the following scenario:
1. OpenSSL calls read() to get 32 bytes from /dev/urandom.
2. The kernel generates 10 bytes of randomness and copies it out.
3. A signal arrives (perhaps SIGALRM).
4. The kernel interrupts the system call to service the signal.
5. Userspace gets 10 bytes of entropy.
6. OpenSSL doesn't read again to get the final 22 bytes. Therefore
OpenSSL doesn't have enough entropy to consider itself initialized.
It never tries again, so we're stuck forever.
The only part I'm not entirely sure about is #6, because the OpenSSL code
is so hard to read.
Thanks to Alex Yip for suggesting that this might be a startup problem.
Bug #10164. Reported-by: Ram Jothikumar <ram@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Simon Horman [Thu, 9 Aug 2012 08:49:54 +0000 (17:49 +0900)]
Allow decoding of Open Flow 1.1 & 1.2 Flow Removed Messages
Signed-off-by: Simon Horman <horms@verge.net.au>
[blp@nicira.com added support for hard_timeout, plus a test] Signed-off-by: Ben Pfaff <blp@nicira.com>
Simon Horman [Tue, 7 Aug 2012 21:49:41 +0000 (06:49 +0900)]
ofp-actions: Return action size
Modify ofpacts_put_openflow11_actions() to return the length of actions
appended. This will be used when encoding Packet Out messages for
Open Flow 1.1 and 1.2. The motivation for this is to avoid open coding
the size calculation which may end up being needed elsewhere too.
Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Pfaff <blp@nicira.com>
Simon Horman [Tue, 7 Aug 2012 21:49:36 +0000 (06:49 +0900)]
ofp-util: ofputil_pull_ofp11_match: Allow OXM match
* Allow OXM matches which specified in OpenFlow 1.2.
Also allow them for OpenFlow 1.1 as there seems little reason not to.
* Pass padded_match_len parameter which if on NULL will be set to
the padded match len. This will be used when decoding flow statistics
response messages.
Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Pfaff <blp@nicira.com>
Justin Pettit [Tue, 7 Aug 2012 22:45:06 +0000 (15:45 -0700)]
vlog: Ignore return value of some write() calls.
A couple of calls to write() would generate warnings when the
"-Wunused-result" compiler option is enabled. This change ignores the
return value, since we can't do anything about it in logging code.
Ben Pfaff [Sat, 21 Jul 2012 16:56:28 +0000 (09:56 -0700)]
tests: Test that ofp10_match bytes that should be ignored really are.
Rob Sherwood reported a bug in OVS treatment of ofp10_match bytes that
should be ignored some time ago:
> In any case, the pktact.SingleWildcardMatch and
> pktact.AllExceptOneWildcardMatch tests were failing because it looks
> like OVS (v1.4 release) was not matching vlan tagged packets when the
> match wildcarded vlan but the dl_vlan value (which should be ignored,
> because it is wildcarded) was non-zero. We've worked around this in
> OFTest by making sure that the dl_vlan value is zero when vlan is
> wildcarded and now the test passes.
>
> In other words:
>
> if (ofp_match->wildcards&OFPFW_DL_VLAN) is true, then the match should
> match both tagged and untagged packets, independent of the value of
> ofp_match->dl_vlan. OVS (seemingly) only matches tagged packets if
> ofp_match->dl_vlan == 0.
I wasn't able to spot the problem at the time, and I still don't see a
problem (perhaps it has been fixed since then), but this commit should
prevent any regression for this specific problem and for anything like it.
It would be natural to modify the parse-ofp11-match test in the same way,
but this commit doesn't do it.
Rob's original bug report is at:
https://mailman.stanford.edu/pipermail/openflow-discuss/2012-March/003107.html
Reported-by: Rob Sherwood <rob.sherwood@bigswitch.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Tue, 7 Aug 2012 18:30:46 +0000 (11:30 -0700)]
learning-switch: Don't use exact-match on every field by default.
OVS has all kinds of odd fields, e.g. registers, and it doesn't make sense
to try to match on all of them. This commit changes learning-switch to
only try to match on the fields defined by OpenFlow 1.0. That's still not
minimal, but it's more reasonable.
This commit should not have an immediately visible effect since
ovs-controller always sends OF1.0 format flows to the switch, and OF1.0
format flows don't have these extra fields. But in the future when we
add support for new protocols and flow formats to ovs-controller, it
will make a difference.
Ben Pfaff [Tue, 7 Aug 2012 17:38:35 +0000 (10:38 -0700)]
learning-switch: Delay sending handshake until version negotiation is done.
The learning-switch implementation needs to know the OpenFlow version in
use to send the initial handshake messages (e.g. the feature request), but
the version is not always available at the time that the code currently
sends the handshake. This can cause an assertion failure later when
ofputil_encode_flow_mod() checks the protocol, which will be 0 if the
version wasn't known.
This commit fixes the problem by introducing a state machine that sends the
handshake messages only after version negotiation has finished.
Reported-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Tue, 24 Jul 2012 23:15:37 +0000 (16:15 -0700)]
learning-switch: Make lswitch own its rconn.
Until now, ovs-controller and the learning-switch code split responsibility
for the OpenFlow connection. This commit moves all the responsibility into
the learning-switch code.
The rationale here is twofold. First, the split itself seems odd; I think
there must have been a reason for it at one time, but I don't remember it
and don't see one anymore. Second, I intend to make the lswitch code more
stateful in upcoming commits, and it seems odd to have the lswitch manage
quite a bit of state but not the entity that that state applies to.
Ben Pfaff [Tue, 7 Aug 2012 18:45:44 +0000 (11:45 -0700)]
vconn: Ensure that vconn_run() is enough to complete a connection.
Until now, it seems that all vconn users have immediately started reading
messages from the connection. Today, however, I added a new user that
only wants to read packets after the OpenFlow version is negotiated, so
it never called vconn_recv() before that happened. It turns out that if
you do this, the version never gets negotiated at all.
This commit fixes the problem by ensuring that vconn_run() will continue
version negotiation if it isn't done yet.
This changes the error return that I get for Unix sockets in the
test-vconn "accept-then-close" test from EPIPE to ECONNRESET, so this
commit also adjusts that test to accept either error code; both of them
seem reasonable enough to me.
Mehak Mahajan [Tue, 7 Aug 2012 19:40:23 +0000 (12:40 -0700)]
Adding checksum to ICMP packets created by OVS for testing.
OVS provides a utility to create ICMP packets for the purpose of
testing using ovs-appctl netdev-dummy/receive. These packets created
by flow_compose() earlier did not have the ICMP checksum in them.
With this commit, the checksum will be added to these test ICMP
packets.