Justin Pettit [Wed, 8 Dec 2010 05:57:09 +0000 (21:57 -0800)]
ofp-print: Print Nicira error extension messages.
Currently, Nicira error messages are non-overlapping with the OpenFlow
error definitions. This commit takes advantage of that by not taking
into account the vendor id. Printing error messages is likely to be
overhauled before long, and a more general approach can be taken then.
Ben Pfaff [Wed, 8 Dec 2010 20:05:20 +0000 (12:05 -0800)]
ofp-print: Print each flow at the start of a line.
Before this commit, the first flow in "ovs-ofctl dump-flows" output was
printed on the same line as the OpenFlow message type name and the xid.
With this commit, that flow is put on a line of its own, like all of the
other flows in the output.
Requested-by: Paul Ingram <paul@nicira.com> CC: Paul Ingram <paul@nicira.com>
Ben Pfaff [Tue, 7 Dec 2010 20:17:03 +0000 (12:17 -0800)]
odp-util: Bump up maximum number of ODP actions.
The kernel supports more than a single page of actions now, so userspace
should be able to take advantage of this.
Upcoming commits will completely replace this data structure but this
commit makes the bug fix clear and is suitable for cherry-picking to
long-term support branches.
Ben Pfaff [Tue, 7 Dec 2010 00:11:55 +0000 (16:11 -0800)]
ofp-util: Fully initialize flow_wildcards in ofputil_cls_rule_from_match().
The new 'zero' member was not being properly initialized. One approach
would be to add an assignment, but it seems more future-proof to let
flow_wildcards_init_catchall() do the right thing.
The old formatting was only good enough for debugging, but now we need to
be able to format cls_rules as part of ofp-print.c. This new code is
modeled after ofp_match_to_string().
Ben Pfaff [Tue, 7 Dec 2010 20:45:24 +0000 (12:45 -0800)]
ofp-util: New abstractions for flow_mod, flow_stats_request.
These will be useful for adding Nicira Extended Match support to ovs-ofctl.
This commit makes ofproto use the new flow_mod abstraction, but not the
new flow and aggregate stats abstraction. The latter takes a bit more
infrastructure that I haven't finished yet.
Ben Pfaff [Thu, 2 Dec 2010 22:15:33 +0000 (14:15 -0800)]
nicira-ext: Clarify and fix macros to check for NXM metadata registers.
The NXM_IS_NX_REG macro didn't check the "hasmask" bit, which meant that it
looked like it was supposed to match both exact and wildcarded NXM headers,
e.g. both NXM_NX_REG0 and NXM_NX_REG0_W. But exact and wildcarded NXM
headers differ not just in the "hasmask" bit but in the "length" value
also (the wildcarded version's length is twice the exact version's length),
so this was not what it actually did.
The only current users of NXM_IS_NX_REG actually only want to match exact
versions, so this commit makes it only match those. It also adds a new
NXM_IS_NX_REG_W macro that matches only wildcarded versions. This new
macro has no users yet, but its existence should help to make it clear that
NXM_IS_NX_REG only matches exact NXM headers.
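The exact-versus-wildcarded distinction described above can be sketched as follows. This is an illustration only, not the code from nicira-ext.h: every SK_ and sk_ name is hypothetical, and the sketch assumes an NXM header laid out as a 16-bit vendor, 7-bit field, 1-bit "hasmask", and 8-bit length, with the register index held in the low 4 bits of the field and a 4-byte register payload.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed header layout: vendor:16 | field:7 | hasmask:1 | length:8. */
#define SK_HEADER(VENDOR, FIELD, HASMASK, LENGTH)                 \
    (((uint32_t) (VENDOR) << 16) | ((uint32_t) (FIELD) << 9) |    \
     ((uint32_t) (HASMASK) << 8) | (uint32_t) (LENGTH))

/* Exact form: hasmask = 0, length = 4.
 * Wildcarded form: hasmask = 1, length = 8 (value plus mask). */
#define SK_REG(IDX)   SK_HEADER(0x0001, IDX, 0, 4)
#define SK_REG_W(IDX) SK_HEADER(0x0001, IDX, 1, 8)

/* Bits that hold the register index (assumed: low 4 bits of the field). */
#define SK_REG_IDX_MASK ((uint32_t) 0xf << 9)

/* True only for exact register headers: vendor, hasmask, and length must
 * all match register 0; only the index bits are ignored. */
static bool
sk_is_reg(uint32_t header)
{
    return (header & ~SK_REG_IDX_MASK) == SK_REG(0);
}

/* True only for wildcarded register headers. */
static bool
sk_is_reg_w(uint32_t header)
{
    return (header & ~SK_REG_IDX_MASK) == SK_REG_W(0);
}
```

Because both the hasmask bit and the length take part in the comparison, neither predicate can accept the other form by accident.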
Ethan Jackson [Sat, 4 Dec 2010 00:49:02 +0000 (16:49 -0800)]
vswitchd: Remove bond/migrate MAC argument.
Before this patch, one could specify a MAC address as part of the
bond/migrate command. This will no longer make sense as bond
hashing becomes more complicated.
Ben Pfaff [Mon, 6 Dec 2010 18:20:20 +0000 (10:20 -0800)]
Refactor and centralize basic OpenFlow message decoding and validation.
Open vSwitch contains a few different chunks of code that need to decode
an OpenFlow message to determine its type and then validate that it is
long enough. Until now, the code for doing this has been more or less
scattered across the tree. Whenever a new piece of code needed to do this,
it generally needed to reimplement at least part of it.
This commit centralizes all of that work into a single function,
ofputil_decode_msg_type(), and helper functions, and converts all of the
code that was decoding messages by hand to use the new function.
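As a rough sketch of what "decode the type, then validate the length" can look like in one place, consider the following. It does not reproduce ofputil_decode_msg_type(); the function name, error codes, and the tiny per-type table are hypothetical, and only the fixed 8-byte OpenFlow header layout (version, type, big-endian length, xid) is assumed.

```c
#include <stddef.h>
#include <stdint.h>

enum sk_error { SK_OK = 0, SK_BAD_LENGTH = -1, SK_BAD_TYPE = -2 };

/* Hypothetical table of minimum message lengths, indexed by message type.
 * Only three types are listed to keep the sketch short. */
static const uint16_t sk_min_len[] = {
    8,      /* type 0: hello */
    12,     /* type 1: error (header plus 16-bit type and 16-bit code) */
    8,      /* type 2: echo request */
};

/* Checks that 'data' holds a complete message of a known type and that its
 * declared length is plausible.  On success stores the type in '*typep'. */
static int
sk_decode_msg_type(const void *data, size_t size, uint8_t *typep)
{
    const uint8_t *p = data;
    size_t declared;
    uint8_t type;

    if (size < 8) {
        return SK_BAD_LENGTH;           /* Not even a full header. */
    }
    type = p[1];
    declared = ((size_t) p[2] << 8) | p[3];
    if (type >= sizeof sk_min_len / sizeof sk_min_len[0]) {
        return SK_BAD_TYPE;
    }
    if (declared < sk_min_len[type] || declared > size) {
        return SK_BAD_LENGTH;
    }
    *typep = type;
    return SK_OK;
}
```

Callers then switch on the returned type without repeating any length checks of their own.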
Ben Pfaff [Tue, 23 Nov 2010 21:20:17 +0000 (13:20 -0800)]
pinsched: Use hmap instead of port_array.
This is the last remaining use of port_array in the tree. It wasn't really
taking advantage of any of the special features of port_arrays, so it's
better to save some time and memory by using an hmap instead.
(In addition, OpenFlow 1.1, which we may eventually want to support, has
changed from 16-bit to 32-bit port numbers, which would require the
port_array code to be rewritten anyhow.)
Ben Pfaff [Mon, 6 Dec 2010 18:03:31 +0000 (10:03 -0800)]
queue: Get rid of ovs_queue data structure.
ovs_queue doesn't seem very useful; it's just a singly-linked list. It's
more generally useful to use a general-purpose "struct list" for lists of
packets, so this commit adds such a member to "struct ofpbuf" and shifts
the existing users to use it.
Ben Pfaff [Mon, 6 Dec 2010 17:56:38 +0000 (09:56 -0800)]
docs: Only regenerate vswitch.pic when the schema really changes.
Until now, vswitch.pic has been rebuilt whenever the schema changed. This
is OK when the E-R diagram would really change, but many changes to the
schema don't change the E-R diagram, and it surprises people when
vswitch.pic changes in such a situation. This commit fixes the problem.
Checksum offloading has changed quite a bit across different kernel
and Xen versions. Since it is part of the skb data structure it is
unfortunately difficult to separate out into compatibility code.
This consolidates all of the checksum code in one place which makes
it easier to read and remove as we prepare for upstreaming. On newer
kernels it also puts everything in inline functions, eliminating the
need to run through the compat code or make extra function calls.
Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Ethan Jackson [Fri, 3 Dec 2010 23:43:24 +0000 (15:43 -0800)]
lib: Add zero padding field to flow_wildcards.
Before this commit, the compiler would add two bytes of padding to
the 'flow_wildcards' structure to achieve 32-bit alignment. These
two bytes had inconsistent values which caused 'flow_wildcards_hash'
to behave inconsistently. This commit explicitly 32-bit aligns
'flow_wildcards' with zero padding.
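The padding problem can be illustrated with a small stand-in structure. This is a sketch under stated assumptions: 'struct sk_wildcards', its members, and the FNV-1a hash are hypothetical and are not the real 'flow_wildcards' layout or hash function. The point is only that a hash over the raw bytes is stable once every byte of the structure is a named member that the initializer sets.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for flow_wildcards.  Without 'zero', the compiler
 * would insert two bytes of padding with unspecified contents. */
struct sk_wildcards {
    uint32_t nw_src_mask;
    uint16_t wildcards;
    uint16_t zero;              /* Explicit padding, always set to 0. */
};

/* Fails to compile if any implicit padding remains. */
_Static_assert(sizeof(struct sk_wildcards) == 8, "unexpected padding");

static void
sk_wildcards_init(struct sk_wildcards *wc, uint32_t mask, uint16_t wildcards)
{
    wc->nw_src_mask = mask;
    wc->wildcards = wildcards;
    wc->zero = 0;
}

/* FNV-1a over the raw bytes of the structure (illustrative hash only). */
static uint32_t
sk_wildcards_hash(const struct sk_wildcards *wc)
{
    const unsigned char *p = (const unsigned char *) wc;
    uint32_t hash = 2166136261u;
    size_t i;

    for (i = 0; i < sizeof *wc; i++) {
        hash ^= p[i];
        hash *= 16777619u;
    }
    return hash;
}
```

Two structures initialized with the same values now hash identically regardless of what the memory held beforehand.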
This commit also fixes an issue where in-band rules were not
getting deleted when in-band control was disabled.
Ben Pfaff [Fri, 3 Dec 2010 21:09:26 +0000 (13:09 -0800)]
datapath: Merge "struct dp_port" into "struct vport".
After the previous commit, which changed the datapath to always create and
attach a vport at the same time, and to always detach and delete a vport
at the same time, there is no longer any real distinction between a dp_port
and a vport. This commit, therefore, merges the two together to simplify
code. It might even improve performance, although I have not checked.
I wasn't sure at first whether the merged structure should be "struct
dp_port" or "struct vport". I went with the latter since the "v" prefix
sounds cool.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
Ben Pfaff [Fri, 3 Dec 2010 20:54:08 +0000 (12:54 -0800)]
netdev-linux: Don't treat "system" devices as vports for setting stats.
Linux kernel datapath vports have a "set_stats" method. Until now,
internal vports have been handled in the userspace netdev library as
type "system", so the "system" netdevs would try to use the vport
"set_stats" method. Now, however, internal netdevs have been broken out
as a separate netdev type, so only that new type of netdev has to be able
to call into "set_stats". This commit, therefore, removes it from the
"system" netdevs.
Ben Pfaff [Fri, 3 Dec 2010 22:41:38 +0000 (14:41 -0800)]
datapath: Make adding and attaching a vport a single step.
For some time now, Open vSwitch datapaths have internally made a
distinction between adding a vport and attaching it to a datapath. Adding
a vport just means to create it, as an entity detached from any datapath.
Attaching it gives it a port number and a datapath. Similarly, a vport
could be detached and deleted separately.
After some study, I think I understand why this distinction exists. It is
because ovs-vswitchd tries to open all the datapath ports before it tries
to create them. However, changing it to create them before it tries to
open them is not difficult, so this commit does this.
The bulk of this commit, however, changes the datapath interface to one
that always creates a vport and attaches it to a datapath in a single step,
and similarly detaches a vport and deletes it in a single step.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
Ben Pfaff [Thu, 4 Nov 2010 23:32:57 +0000 (16:32 -0700)]
datapath: Encapsulate parameters for new vports in new struct vport_parms.
Upcoming commits will keep needing to pass more information to the vport
'create' member function. It's annoying to have to modify a dozen pieces
of code every time just to do this, so this commit encapsulates all of the
parameters in a new struct and passes that instead.
Ben Pfaff [Fri, 3 Dec 2010 18:32:38 +0000 (10:32 -0800)]
ovs-ofctl: Reimplement dumping particular tables.
"dump-flows" and "dump-aggregate" are documented to accept a "table"
value to dump only a particular OpenFlow table, but commit 8050b31d6
"ofp-parse: Refactor flow parsing" broke this, by always dumping table
0. This commit should fix it (though I haven't tested it).
Ben Pfaff [Wed, 1 Dec 2010 00:49:10 +0000 (16:49 -0800)]
datapath: Add __aligned_u64 compat support for user and kernel headers.
__aligned_u64 is a 64-bit integer type that is guaranteed to be aligned on
a 64-bit boundary. It is used in ABI structures to allow them to be shared
between 32- and 64-bit userspace without the need for kernel compat code.
The first use in OVS is coming up in this series of patches.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
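To see why an explicitly aligned 64-bit type matters for a shared ABI, here is a minimal userspace sketch. The names 'sk_aligned_u64' and 'struct sk_abi_msg' are hypothetical, and the typedef uses the GCC/Clang 'aligned' attribute as an assumption about how such a type can be spelled. On some 32-bit ABIs (i386, for example) a plain 64-bit member is only 4-byte aligned, so the same structure would otherwise have different layouts in 32- and 64-bit builds.

```c
#include <stddef.h>
#include <stdint.h>

/* A 64-bit integer that is 8-byte aligned on every target
 * (GCC/Clang attribute syntax; hypothetical type name). */
typedef uint64_t sk_aligned_u64 __attribute__((aligned(8)));

/* Hypothetical ABI structure shared between kernel and userspace.  With a
 * plain uint64_t, 'tun_id' could land at offset 4 on i386 and at offset 8
 * on x86-64; with the aligned type it is at offset 8 on both. */
struct sk_abi_msg {
    uint32_t port_no;
    sk_aligned_u64 tun_id;
};
```

With this definition the member offset and total size are the same for 32- and 64-bit userspace, so no translation layer is needed.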
Ben Pfaff [Tue, 30 Nov 2010 23:58:55 +0000 (15:58 -0800)]
datapath: Change vals[] in struct port_lookup_key into discrete members.
The 'vals' array is only convenient for use by port_hash(). It's a
liability otherwise, since it makes the code wider and harder to read and
seems to me less amenable to compiler optimization.
In an upcoming patch the key needed in struct port_lookup_key will
increase in size to 64 bits, so that using an array of u32 becomes even
more problematic. Therefore, this commit gets rid of the array in favor
of discrete named members that carry the same information.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
Ben Pfaff [Wed, 1 Dec 2010 00:54:50 +0000 (16:54 -0800)]
Format tunnel IDs consistently.
Some code failed to convert tunnel IDs to host byte order for printing,
so this fixes that. Some code printed tunnel IDs with a 0x prefix and
other code didn't, so this code uses the '#' flag consistently (which
prints 0x for nonzero values and omits it for zero).
This commit also stops always printing all 8 digits. When tunnel IDs
are expanded to 64 bits, as they will be soon, printing 16 digits all the
time wastes too much space.
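A minimal sketch of the formatting convention described above, assuming a 32-bit tunnel ID stored in network byte order. The function name and the "tun_id=" label are hypothetical; the relevant parts are the conversion to host byte order and the '#' flag without a fixed field width.

```c
#include <arpa/inet.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Formats a tunnel ID given in network byte order.  "%#x" prints a "0x"
 * prefix for nonzero values and a bare "0" for zero, and no width is
 * given, so only the significant digits are printed. */
static int
sk_format_tun_id(char *buf, size_t size, uint32_t tun_id_be)
{
    return snprintf(buf, size, "tun_id=%#" PRIx32, ntohl(tun_id_be));
}
```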
Jesse Gross [Wed, 24 Nov 2010 01:03:16 +0000 (17:03 -0800)]
datapath: Allow skbs with a frag list.
We can already receive packets with a frag list due to reassembly
in CAPWAP tunneling. Since we can handle it, we might as well open
it up to internal devices as well to prevent linearization.
Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Jesse Gross [Wed, 24 Nov 2010 00:46:47 +0000 (16:46 -0800)]
datapath: Don't set dev->last_rx on kernels >= 2.6.29.
dev->last_rx is used for rebalancing in Linux bonding. However,
on an SMP machine it quickly becomes a very hot cacheline. On
kernels 2.6.29 and later the networking core will update last_rx
only if bonding is in use, so drivers do not need to set it at all.
Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Jesse Gross [Wed, 24 Nov 2010 06:08:27 +0000 (22:08 -0800)]
datapath: Constify ops structures.
vport_ops, tunnel_ops, and ethtool_ops should not change at runtime.
Therefore, mark them as const to keep them out of the hotpath and to
prevent them from getting trampled.
Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Jesse Gross [Wed, 24 Nov 2010 17:27:10 +0000 (09:27 -0800)]
datapath: Provide compatibility code for SET_ETHTOOL_OPS constness.
On 2.6.18 dev->ethtool_ops was not marked as const. This adds a
compatibility macro that casts away the constness so we can mark
our ethtool ops as const on later kernels.
Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Jesse Gross [Wed, 24 Nov 2010 06:35:15 +0000 (22:35 -0800)]
datapath: Add compatibility code for inet_add_protocol().
Kernels earlier than 2.6.32 did not mark struct net_protocol as const
in inet_add_protocol() and inet_del_protocol(). This provides compatibility
code to cast away the constness on these kernels so we can have them be
const on newer kernels.
Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Jesse Gross [Wed, 24 Nov 2010 00:34:22 +0000 (16:34 -0800)]
datapath: Use __read_mostly annotations where appropriate.
Variables which are changed only infrequently should be annotated
with __read_mostly, which will group them together in a special
linker section. This prevents them from sharing cache lines with
data on the hot path.
Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Jesse Gross [Sat, 20 Nov 2010 01:20:34 +0000 (17:20 -0800)]
datapath: Don't unnecessarily set skb mac header.
We currently call skb_reset_mac_header() in a few places when a
packet is received. However, this is not needed because flow_extract()
will set all of the protocol headers during parsing and nothing needs
the packet headers before that time.
Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Jesse Gross [Fri, 19 Nov 2010 21:55:18 +0000 (13:55 -0800)]
datapath: Remove share check for internal devices.
When transmitting on a device, dev_hard_start_xmit() always provides
a private clone. The skb_share_check() in internal_dev_xmit() is
therefore unnecessary, so remove it.
Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Jesse Gross [Fri, 19 Nov 2010 21:49:54 +0000 (13:49 -0800)]
vport: Remove unused error types.
We currently track rx_over_errors, rx_crc_errors, rx_frame_errors,
and collisions but never increment these counters. It seems likely
that we will never use them since they are primarily hardware errors
and we pull hardware stats directly from the NIC. This removes those
counters, saving 32 bytes per port.
Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Jesse Gross [Fri, 19 Nov 2010 21:19:55 +0000 (13:19 -0800)]
datapath: Drop obsolete comment.
The comment above flow_extract() refers to setting OVS_CB(skb)->is_frag
but that member no longer exists. The correct way to set is_frag is
already documented, so just drop the incorrect comment.
Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Jesse Gross [Thu, 2 Dec 2010 21:07:36 +0000 (13:07 -0800)]
tunneling: Clear IP control block in one memset.
We currently clear both the members of the IPCB individually before
entering the IP stack. It's simpler and more robust to just clear
the entire structure.
Suggested-by: Ben Pfaff <blp@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
Jesse Gross [Fri, 19 Nov 2010 21:10:14 +0000 (13:10 -0800)]
tunneling: Clear OVS_CB after call to update_header().
If a packet is traversing the IP stack we need to clear some pieces
of the skb CB beforehand. We currently do this before the call to
update_header() but header generation may need some members of the
CB, such as the key. Therefore, zero out the CB only after the
header is complete.
Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Tue, 16 Nov 2010 18:50:52 +0000 (10:50 -0800)]
ovs-ofctl: Check that commands actually succeed.
Until now, when it sends commands to switches that ordinarily have no
reply, ovs-ofctl has not waited around to see whether the command succeeds
or fails. This commit fixes the problem: errors will now be reported.
Ben Pfaff [Wed, 1 Dec 2010 19:03:12 +0000 (11:03 -0800)]
dot2pic: Be less picky parsing "dot" output.
Some versions of "dot" put two spaces after the "node" keyword instead of
one, which didn't match the regular expression used in dot2pic. This
commit changes dot2pic not to care about the number of spaces in "node" and
"graph" lines. (The "graph" lines weren't actually a problem but I don't
see a reason to be picky about them either.)
Different versions of "dot" still produce different output for the same
input, but I don't see how to avoid that.
Ben Pfaff [Tue, 30 Nov 2010 21:44:01 +0000 (13:44 -0800)]
Implement stress option framework.
Stress options allow developers testing Open vSwitch to trigger behavior
that otherwise would occur only in corner cases. Developers and testers
can thereby more easily discover bugs that would otherwise manifest only
rarely or nondeterministically. Stress options may cause surprising
behavior even when they do not actually reveal bugs, so they should only be
enabled as part of testing Open vSwitch.
This commit implements the framework and adds a few example stress options.
This commit started from code written by Andrew Lambeth.
Suggested-by: Henrik Amren <henrik@nicira.com> CC: Andrew Lambeth <wal@nicira.com>
Ben Pfaff [Mon, 1 Nov 2010 21:14:27 +0000 (14:14 -0700)]
coverage: Make the coverage counters catalog program-specific.
Until now, the collection of coverage counters supported by a given OVS
program was not specific to that program. That means that, for example,
even though ovs-dpctl does not have anything to do with mac_learning, it
still has a coverage counter for it. This is confusing, at best.
This commit fixes the problem on some systems, in particular on ones that
use GCC and the GNU linker. It uses the feature of the GNU linker
described in its manual as:
If an orphaned section's name is representable as a C identifier then
the linker will automatically see PROVIDE two symbols: __start_SECNAME
and __end_SECNAME, where SECNAME is the name of the section. These
indicate the start address and end address of the orphaned section
respectively.
Systems that don't support these features retain the earlier behavior.
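The linker feature quoted above can be sketched as follows, assuming GCC (or Clang) and the GNU linker on an ELF system. All names here are hypothetical and this is not the OVS coverage code. Note that in practice GNU ld names the end symbol __stop_SECNAME, which is what the sketch relies on. Each counter registers a pointer to itself in a named section, and the program walks only the pointers that were actually linked into it.

```c
#include <stddef.h>
#include <string.h>

struct sk_counter {
    const char *name;
    unsigned int count;
};

/* Defines a counter and places a pointer to it in the "sk_counters"
 * section.  'used' keeps the otherwise unreferenced pointer from being
 * discarded by the compiler. */
#define SK_DEFINE_COUNTER(NAME)                                      \
    static struct sk_counter sk_counter_##NAME = { #NAME, 0 };       \
    static struct sk_counter *sk_counter_ptr_##NAME                  \
        __attribute__((used, section("sk_counters")))                \
        = &sk_counter_##NAME

SK_DEFINE_COUNTER(mac_learning);
SK_DEFINE_COUNTER(flow_extract);

/* Provided automatically by GNU ld because "sk_counters" is a valid C
 * identifier. */
extern struct sk_counter *__start_sk_counters[];
extern struct sk_counter *__stop_sk_counters[];

/* Number of counters linked into this program. */
static size_t
sk_n_counters(void)
{
    return (size_t) (__stop_sk_counters - __start_sk_counters);
}

/* Looks up a counter by name, or returns NULL if this program has none. */
static struct sk_counter *
sk_find_counter(const char *name)
{
    struct sk_counter **p;

    for (p = __start_sk_counters; p < __stop_sk_counters; p++) {
        if (!strcmp((*p)->name, name)) {
            return *p;
        }
    }
    return NULL;
}
```

Since the section contains only what the linker pulled in, a program that never links the mac_learning code would simply not see that counter, and no central list of counter files is required.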
This commit also fixes the annoyance that files that include coverage
counters must be listed on COVERAGE_FILES in lib/automake.mk.
This commit also fixes the annoyance that modifying any source file that
includes a coverage counter caused all programs that link against
libopenvswitch.a to relink, even programs that the source file was not
linked into. For example, modifying ofproto/ofproto.c (which includes
coverage counters) caused tests/test-aes128 to relink, even though
test-aes128 does not link against ofproto.o.
Ben Pfaff [Mon, 1 Nov 2010 17:47:29 +0000 (10:47 -0700)]
netdev-linux: Remove counter double-increments.
A few coverage counters were incremented both in netdev generic code and
in netdev_linux code. This commit drops the increments from the
lower-level code.
(This is not an actual bug because these counters are used only for
logging.)
Ben Pfaff [Tue, 30 Nov 2010 18:29:25 +0000 (10:29 -0800)]
vswitch: Update dia-generated diagram.
This probably didn't get updated automatically because the last update to
vswitch.ovsschema was made by a developer without one of the required
tools installed.
Ben Pfaff [Tue, 30 Nov 2010 01:09:53 +0000 (17:09 -0800)]
cfm: Fix GCC warning.
On 32-bit platforms GCC warns:
../lib/cfm.c: In function 'compose_ccm':
../lib/cfm.c:130: warning: integer constant is too large for 'long' type
../lib/cfm.c: In function 'cfm_should_process_flow':
../lib/cfm.c:375: warning: integer constant is too large for 'long' type
This fixes the problem by using the UINT64_C macro from <inttypes.h> to
write a 64-bit constant.
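For illustration, a constant wider than 32 bits can be written like this. The macro and helper names are hypothetical and the value is just an example 48-bit Ethernet address, not necessarily the one in lib/cfm.c. UINT64_C expands to the literal with whatever suffix the platform needs for a 64-bit type, so the compiler no longer has to fit the value into 'long'.

```c
#include <stdint.h>

/* Example 48-bit Ethernet address stored in a 64-bit integer.  Without
 * UINT64_C, a compiler in C90 mode on a 32-bit platform may warn that the
 * constant is too large for 'long'. */
#define SK_ETH_ADDR UINT64_C(0x0180c2000030)

/* Returns octet 'i' of the address, where octet 0 is the first (most
 * significant) one and octet 5 is the last. */
static uint8_t
sk_eth_addr_octet(uint64_t ea, int i)
{
    return (uint8_t) (ea >> (8 * (5 - i)));
}
```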
Ben Pfaff [Mon, 15 Nov 2010 23:53:00 +0000 (15:53 -0800)]
unaligned: Add unaligned accessors for ovs_be<N> data.
These accessors are semantically identical to the ones for uint<N>_t data,
but the names are more informative to readers, and the types provide
annotations for sparse.
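One portable way to write such accessors is with memcpy(), sketched below. This is not necessarily how the OVS "unaligned" header implements them, and 'sk_be32' is a hypothetical stand-in for ovs_be32. The accessors move the bytes unchanged; converting between network and host byte order remains the caller's job.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for ovs_be32: a 32-bit value in network byte order. */
typedef uint32_t sk_be32;

/* Reads a big-endian 32-bit value from a possibly misaligned address.
 * memcpy() avoids the undefined behavior of dereferencing a misaligned
 * pointer. */
static sk_be32
sk_get_unaligned_be32(const void *p)
{
    sk_be32 x;

    memcpy(&x, p, sizeof x);
    return x;
}

/* Stores a big-endian 32-bit value at a possibly misaligned address. */
static void
sk_put_unaligned_be32(void *p, sk_be32 x)
{
    memcpy(p, &x, sizeof x);
}
```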
Ben Pfaff [Mon, 29 Nov 2010 20:28:26 +0000 (12:28 -0800)]
Make installation directories overridable at runtime.
This makes it possible to run tests that need access to installation
directories, such as the rundir, without having access to the actual
installation directories (/var/run is generally not world-writable), by
setting environment variables. This is not a good way to do things in
general--usually it would be better to choose the correct directories
at configure time--so for now this is undocumented.
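The general shape of such an override is sketched below. The environment variable name, default path, and function name are all hypothetical, since the commit deliberately leaves the real variables undocumented; the sketch only shows a configure-time default that an environment variable can replace at runtime.

```c
#include <stdlib.h>

/* Hypothetical configure-time default. */
#define SK_DEFAULT_RUNDIR "/var/run/sketch"

/* Returns the run directory: the value of the (hypothetical) SK_RUNDIR
 * environment variable if it is set and nonempty, otherwise the
 * configure-time default. */
static const char *
sk_rundir(void)
{
    const char *dir = getenv("SK_RUNDIR");

    return dir && dir[0] ? dir : SK_DEFAULT_RUNDIR;
}
```

A test harness could then point the variable at a writable scratch directory before starting the program under test.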
This commit implements a subset of the 802.1ag specification for
Connectivity Fault Management (CFM) using Continuity Check Messages
(CCM). When CFM is configured on an interface CCMs are broadcast
at regular intervals to detect missing or unexpected connectivity.
Ben Pfaff [Mon, 29 Nov 2010 22:08:29 +0000 (14:08 -0800)]
flow: Delete unused FWW_VLAN_TCI bit.
This wasn't used intentionally anywhere, but some code was turning it on
accidentally (because it was part of FWW_ALL) and other code was not, which
caused confusion. In particular, the NXM code turned it on by default
and the OpenFlow 1.0 code did not, which caused flow stat requests to
return different results depending on format. Deleting it fixes the bug.