Simon Horman [Wed, 24 Aug 2011 01:40:52 +0000 (10:40 +0900)]
docs: Suppress "warning: macro `DD' not defined" warning
Suppress "warning: macro `DD' not defined" warning for ovs-brcompatd.8.
As per the description by Ben Pfaff for the same problem effecting
other files:
deamon.man allows the file that is including it to include extra
text in the description of --detach by defining a macro named DD.
Only some of the manpages that included it did this (only those
manpages that needed extra text there). But it's better to be
quiet in "man --warnings", so this defines DD to an empty value in
the other manpages that include daemon.man.
Simon Horman [Wed, 24 Aug 2011 01:40:51 +0000 (10:40 +0900)]
Debian: Add dependency on ${misc:Depends}
Add dependency on ${misc:Depends} to openvswitch-brcompat and ovsdbmonitor.
As reported by Lintian:
The source package uses debhelper, but it does not include
${misc:Depends} in the given binary package's debian/control entry.
Any debhelper command may add dependencies to ${misc:Depends} that
are required for the work that it does, so recommended best
practice is always add ${misc:Depends} to the dependencies of each
binary package if debhelper is in use.
Refer to the debhelper(7) manual page for details.
Justin Pettit [Thu, 18 Aug 2011 17:35:40 +0000 (10:35 -0700)]
datapath: Use "OVS_*" as opposed to "ODP_*" for user<->kernel interactions.
The prefix "ODP_*" is not overly descriptive in the context of the
larger Linux tree. This commit changes the prefix to "OVS_*" for the
userpace to kernel interactions. The userspace libraries still use
"ODP_" in many of their interfaces since it is more descriptive in the
OVS oeuvre.
Ben Pfaff [Fri, 19 Aug 2011 21:29:27 +0000 (14:29 -0700)]
ofproto-dpif: Delete MAC learning entries when they expire.
Commit fa066f015f716c7 "bridge: Move packet processing functionality into
ofproto" deleted the call to mac_learning_run() that deletes MAC learning
table entries when they expire. This fixes the problem.
Justin Pettit [Thu, 18 Aug 2011 16:57:23 +0000 (09:57 -0700)]
datapath: Correct comment for vport_add().
The comment describing vport_add() incorrectly stated that the function
added the vport to the datapath. It is the responsibility of the caller
to do that.
Ben Pfaff [Thu, 18 Aug 2011 17:33:32 +0000 (10:33 -0700)]
odp-util: Fix parsing of Ethertypes 0x8000 and above.
An existing comment in the function being updated explains the problem:
* Many of the sscanf calls in this function use oversized destination
* fields because some sscanf() implementations truncate the range of %i
* directives, so that e.g. "%"SCNi16 interprets input of "0xfedc" as a
* value of 0x7fff. The other alternatives are to allow only a single
* radix (e.g. decimal or hexadecimal) or to write more sophisticated
* parsers.
Ethan Jackson [Tue, 16 Aug 2011 21:16:58 +0000 (14:16 -0700)]
ofproto: Fix over accounting of byte counters.
The update_stats() function in ofproto was attributing more bytes
to facets than they had actually accrued. This could potentially
throw off bond load balancing.
Ethan Jackson [Tue, 16 Aug 2011 20:50:00 +0000 (13:50 -0700)]
ofproto: Clear packet and byte counters on flow additions.
When a flow is added to the flow table, its packet and byte
counters should be reset. This patch efficiently approximates this
behavior. It also does some minor code refactoring.
Ben Pfaff [Wed, 10 Aug 2011 19:49:35 +0000 (12:49 -0700)]
poll-loop: Remove static variable n_waiters.
It's always a little risky to track the length of a list by hand, because
it is easy to miss a spot where the length can change. So it seems like
a small cleanup to just measure the length of the 'waiters' list at the
point where we need to know it. list_size() is O(n) in the length of the
list, but the function that calls it is already O(n) in that length so it
seems like a fair trade-off.
Ethan Jackson [Wed, 10 Aug 2011 20:32:51 +0000 (13:32 -0700)]
nx-match: New function nxm_read_field_bits().
nxm_read_field_bits() simplifies reading of NXM fields with an
ofs_nbits parameter. This patch updates nxm_execute_reg_move() to
use the new function. A user outside of the nx-match module will
be added in future patches.
Ethan Jackson [Fri, 29 Jul 2011 20:15:09 +0000 (13:15 -0700)]
flow: New FLOW_WC_SEQ build assertion.
Changing "struct flow" or its wildcards requires minor adjustments
in many places in the code. This patch adds a new FLOW_WC_SEQ
sequence number which when incremented will cause build assertion
failures aiding the developer in finding code which needs to
change.
Ben Pfaff [Wed, 10 Aug 2011 16:13:12 +0000 (09:13 -0700)]
debian: Avoid installing duplicate files in ovsdbmonitor package.
This is just a typo introduced in commit 57483aeda (debian: Fix bug from
commit 211b05b5 "debian: Modernize use of dh_install.) that caused the
ovsdbmonitor package to install too many files.
Ben Pfaff [Tue, 9 Aug 2011 20:46:51 +0000 (13:46 -0700)]
nicira-ext: Fix NXM example.
The code and the specification say that nxm_length includes both value
and mask, but this example showed nxm_length only including the value.
This commit fixes it.
Ben Pfaff [Tue, 9 Aug 2011 19:55:13 +0000 (12:55 -0700)]
ovs-ofctl: Fix reading flows from file for "replace-flows", "diff-flows".
Commit c821124b25e "ovs-ofctl: Accept only valid flow_mod and
flow_stats_request fields" caused actions read by read_flows_from_file()
to be ignored and treated as "drop". This fixes the problem.
Sanjay Sane [Tue, 9 Aug 2011 18:08:27 +0000 (11:08 -0700)]
Option to forward BPDU (Ethernet control class) frames
Currently, a NORMAL action bridge drops reserved-multicast-mac addresses;
01-80-c2-00-00-[f0:ff]. A node that does not implement STP should have an
option to forward such frames.
This commit proposes to have a configuration option to allow forwarding of
BPDU class frames. To ensure backward compatibility, this option is
disabled by default.
This config can be set using bridge's other-config column, for e.g
ovs-vsctl set bridge br0 other-config:forward-bpdu=true
Changing this option can revalidate all flows in a software-OVS
implementation (ofproto-dpif)
--------
unit tests:
------------
make config changes, test runtime behavior
-- test runtime behavior --
continuously send packets to br0 with dest-mac=01:80:c2:00:00:00
Ben Pfaff [Thu, 28 Jul 2011 22:01:20 +0000 (15:01 -0700)]
ofproto: Allow ->rule_choose_table() to be NULL regardless of table count.
In the upcoming software switch implementation of multiple tables, there is
no reason to prefer one table over another, so we always put rules into
table 0 by default. This commit allows this to be done simply by
specifying NULL as ->rule_choose_table().
Ben Pfaff [Thu, 28 Jul 2011 21:19:52 +0000 (14:19 -0700)]
ofproto: Make ->construct() and ->destruct() more symmetrical.
The ofproto_provider's ->construct() was required to allocate and
initialize the flow tables, but ->destruct() was not allowed to
uninitialize and free them. This arrangement is oddly asymmetrical (and
undocumented), so this commit changes the code so that the client is
responsible for both allocation and freeing.
Ben Pfaff [Mon, 8 Aug 2011 22:43:29 +0000 (15:43 -0700)]
ofproto-dpif: Make ofproto/trace accept an odp_flow in place of a packet.
It's often easier to get a flow than it is to get a packet. For example,
"ovs-dpctl dump-flows" prints flows, but it doesn't print any of the
packets in those flows. It is also easier to construct a flow by hand than
it is to construct packet bytes.
This commit, therefore, makes ofproto/trace accept a flow specification in
place of packet data. In a few cases a flow by itself is not sufficient
to determine what would happen to a packet, so in those cases the command
prints a message to that effect.
An upcoming commit will also start using this feature to unit-test action
translation in ofproto-dpif.
Ben Pfaff [Thu, 4 Aug 2011 20:32:19 +0000 (13:32 -0700)]
odp-util: Format VLAN headers more like other headers in ODP flow output.
The rest of the headers all follow the form "header(value)" or
"header(key1=value1,key2=value2,...)" but VLAN headers left out the "="
characters. This adds them in for consistency.
Ben Pfaff [Mon, 8 Aug 2011 19:49:17 +0000 (12:49 -0700)]
netdev: Decouple creating and configuring network devices.
Until now, each call to netdev_open() for a particular network device
had to either specify a set of network device arguments that was either
empty or (for devices that already existed) equal to the existing device's
configuration. Unfortunately, the definition of "equality" in the latter
case was mostly done in terms of strict equality of string-to-string maps,
which caused problems in cases where, for example, one set of arguments
specified the default value of an optional argument explicitly and the
other omitted it.
The netdev interface does have provisions for defining equality other ways,
but this had only been done in one case that was especially problematic in
practice. One way to solve this particular problem would be to carefully
define equality in all the problematic cases.
This commit takes another approach based on the realization that there is
really no need to do any comparisons. Instead, it removes configuration
at netdev_open() time entirely, because almost all of netdev_open()'s
callers are not interested in creating and configuring a netdev. Most of
them just want to open a configured device and use it. Therefore, this
commit stops providing any configuration arguments to netdev_open() and the
provider functions that it calls. Instead, a caller that does want to
configure a device does so after it opens it, by calling
netdev_set_config().
This change allows us to simplify the netdev interface a bit. There is no
longer any need to implement argument comparisons. As a result, there is
also no need for "struct netdev_dev" to keep track of configuration at all.
Instead, the network devices that have configuration keep track of it in
their own internal form.
This new interface does mean that it becomes possible to accidentally
create and try to use an unconfigured netdev that requires configuration.
Bug #6677. Reported-by: Paul Ingram <paul@nicira.com>
Ben Pfaff [Fri, 5 Aug 2011 21:15:32 +0000 (14:15 -0700)]
netdev: Clean up and refactor packet receive interface.
The Open vSwitch tree only has one user of the ability for a netdev to
receive packets from a network device. Thus, this commit simplifies the
common-case use of the netdev interface by replacing the "ethertype" option
from "struct netdev_options" by a new netdev_listen() call.
The only user of netdev_listen() wants to receive all packets from a
network device, so this commit also removes the ability to restrict the
received packets to a particular protocol. (This ability was once used by
the Open vSwitch integrated DHCP client, but that code has been removed.)
This commit also simplifies and improves the implementation of the code
in netdev-linux that started listening to a network device. Before, I had
not figured out how to avoid receiving all packets on all devices before
binding to a particular device, but I took a closer look at the kernel code
and figured it out.
I've tested that the userspace datapath (dpif-netdev), the only user of
netdev_recv(), still works after this change.
Ben Pfaff [Fri, 5 Aug 2011 21:14:18 +0000 (14:14 -0700)]
bridge: Add port to datapath before trying to retrieve device stats.
Virtual ports such as GRE tunnels don't exist until after the port is
added to the datapath, so without this change adding such a port yields
a warning like the following:
netdev|WARN|failed to retrieve MTU for network device gre0: No such device
Ben Pfaff [Fri, 5 Aug 2011 22:56:05 +0000 (15:56 -0700)]
ovsdb: Correct specification inconsistency between "lock" and "assert".
The "lock" request requires the lock name to be an <id> but it is shown as
<string> in the "assert" operation. This corrects the "assert"
specification and fixes the suggested naming convention (since ":" is not
valid in an <id>).
This commit also updates the implementation to match the specification.
Ben Pfaff [Fri, 5 Aug 2011 22:48:45 +0000 (15:48 -0700)]
ofp-util: Rewrite action decoding to improve compiler warnings.
The previous implementation of ofputil_decode_action() had two weaknesses.
First, the action lengths in its tables were written as literal integers
instead of using "sizeof". Second, it used arrays for tables instead of
switch statements, which meant that GCC didn't warn when someone added a
new action type but forgot to add an entry to the tables.
Ben Pfaff [Fri, 5 Aug 2011 23:58:02 +0000 (16:58 -0700)]
Document and warn that mirroring to a VLAN is incompatible with SLB bonding.
vswitchd/INTERNALS explains the incompatibility:
2. When Open vSwitch forwards a multicast or broadcast packet to a
link in the SLB bond other than the active slave, the remote
switch will forward it to all of the other links in the SLB
bond, including the active slave. Without special handling,
this would mean that Open vSwitch would forward a second copy of
the packet to each switch port (other than the bond), including
the port that originated the packet.
Open vSwitch deals with this case by dropping packets received
on any SLB bonded link that have a source MAC+VLAN that has been
learned on any other port. (This means that SLB as implemented
in Open vSwitch relies critically on MAC learning. Notably, SLB
is incompatible with the "flood_vlans" feature.)
We could go farther than this and automatically change the bonding mode to
a safer one (e.g. active-backup) when flood_vlans are enabled. However,
that would still leave the SLB fallback for LACP modes in place; perhaps
active-backup would have to be the fallback for LACP modes when flood_vlans
are enabled.
Ben Pfaff [Mon, 8 Aug 2011 16:35:01 +0000 (09:35 -0700)]
ChangeLog: Rename NEWS.
Many users seem to miss changes that Open vSwitch clearly states in
ChangeLog. Some of this is probably due to users not reading any
documentation ever, but I wonder whether some of it is because users
expect ChangeLog to be akin to a commit log, because that's what GNU
software puts in ChangeLog. Users don't read GNU ChangeLogs, because
they don't contain information useful to users.
GNU software puts what we have in ChangeLog in a file named NEWS. Some
users read it. Therefore, this commit renames ChangeLog to NEWS.
Also, strip trailing whitespace for Ethan-compliance.
Justin Pettit [Tue, 2 Aug 2011 04:18:00 +0000 (21:18 -0700)]
ovs-appctl: Add "version" command to print version of running daemons.
When debugging a running system, we occasionally see a mismatch of
different versions because someone forgets to restart one or more
daemons. Often times, it would be useful to know what's running as
opposed to what's on the current runpath.
Ben Pfaff [Wed, 3 Aug 2011 22:01:11 +0000 (15:01 -0700)]
in-band: Delete remaining rules when disabling in-band control.
in_band_destroy() doesn't remove all of the rules that in-band control
adds (and it cannot, because that might require waiting for an existing
asynchronous flow modification or addition to complete), so turning on
other-config:disable-in-band or deleting all of the OpenFlow controllers
did not delete all of the in-band rules.
This commit fixes the problem by making the in-band control object hang
around until all of the flows that it set up have actually been deleted.
This problem was introduced as part of commit 7ee20df "ofproto: Implement
asynchronous OFPT_FLOW_MOD commands."
Ben Pfaff [Wed, 3 Aug 2011 21:31:54 +0000 (14:31 -0700)]
connmgr: Drop 'next_in_band_update' timer.
This timer used to exist because it was possible for in-band remote IP
addresses to change without any intentional configuration change in one
case: where controller discovery found a new controller. Controller
discovery was removed long ago, but the reason for the timer had been
forgotten and so remained. This commit removes it.
Simon Horman [Wed, 3 Aug 2011 02:19:47 +0000 (11:19 +0900)]
datapath: Allow the number of hash entries to exceed TBL_MAX_BUCKETS
* If the number of entries in a table exceeds
the number of buckets that it has then
an attempt will be made to resize the table.
* There is a limit of TBL_MAX_BUCKETS placed on
the number of buckets of a table.
* If this limit is exceeded keep using the existing table.
This allows a table to hold more than TBL_MAX_BUCKETS
entries at the expense of increased hash collisions.
Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Jesse Gross <jesse@nicira.com>
Ben Pfaff [Thu, 28 Jul 2011 17:19:42 +0000 (10:19 -0700)]
vlog: Add a new log level "off".
Until now, "emer" has effectively been "off" because no messages were ever
logged at "emer" level. Justin points out that it is useful to use "emer"
for messages that indicate a fatal error. This commit makes that change
and adds a new "off" level to really turn off all logging to a facility.
flex_arrays didn't exist at all until 2.6.30, weren't exported to modules
until 2.6.38, and performed poorly until 3.0, so this backports the
functionality to older kernels.
Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
datapath: Don't pass __GFP_ZERO to kmalloc on older kernels.
On new kernels kzalloc() is simply a wrapper around kmalloc with
the addition of the __GFP_ZERO flag. flex_arrays take advantage
of this by expecting the user to just pass in this flag if they
want the memory to be zeroed. However, before 2.6.23, kzalloc()
was a function in its own right and kmalloc really didn't like
receiving __GFP_ZERO. This overrides kmalloc() to intercept the
flags and direct the call to the right function.
Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Simon Horman [Thu, 28 Jul 2011 23:38:50 +0000 (16:38 -0700)]
ofproto-dpif: Allow setting of flow eviction threshold
Allow setting the number of flows present in the flow hash
at which point eviction of entries from the kernel flow hash
will begin to occur.
The value may be set using a bridge's other-config column.
e.g.
ovs-vsctl set bridge br3 other-config:flow-eviction-threshold=10000
default is 1000, reflecting constant value previously used.
Increasing this value can result in reduced CPU usage and
packet loss in situations where the number of active flows
is significantly larger than 1000.
ODP_ACTION_ATTR_CONTROLLER in the kernel actually sends packets to
userspace, not the controller. To make it generic rename this action
to ODP_ACTION_ATTR_USERSPACE.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Ben Pfaff [Thu, 21 Jul 2011 21:44:54 +0000 (14:44 -0700)]
ovs-ofctl: Fix dump-ports and queue-stats commands.
These ovs-ofctl commands have been sending malformed stats requests since
commit 63f2140a553 "openflow: Make stats replies more like other OpenFlow
messages." This commit fixes the problem and adds basic unit tests that
should prevent similar regressions.
Ben Pfaff [Wed, 27 Jul 2011 21:56:03 +0000 (14:56 -0700)]
netlink-socket: Reduce nl_sock_recv() from 2 (or more) system calls to 1.
Until now, each attempt to receive a message from a Netlink socket has
taken at least two system calls, one to check the size of the message to
be received and a second one to delete the message from the socket buffer.
This commit switches to a new strategy that requires only one system call
per message received.
In my testing this increases the maximum flow setups per second by a little
over 10%.
Ben Pfaff [Tue, 26 Jul 2011 16:46:38 +0000 (09:46 -0700)]
debian: Move OVSDB schema and ovsdb-tool to openvswitch-switch.
ovs-vswitchd in the openvswitch-switch package is tightly coupled to its
database schema. During development, it's possible to change the schema
without changing the Open vSwitch version number, which makes it possible
for the openvswitch-switch and openvswitch-common packages to get out of
sync: openvswitch-switch requires the same version of openvswitch-common,
but if the version number doesn't get updated that has no effect.
Actually putting the schema and ovs-vswitchd (its primary user) in the
same package prevents them from getting out-of-sync. This commit also
moves ovsdb-tool because that program often works directly with OVSDB
schemas and so there's not much point having it around without a schema to
work with.
Ben Pfaff [Fri, 22 Jul 2011 22:31:32 +0000 (15:31 -0700)]
ofproto-dpif: Restore former NORMAL action behavior when revalidating.
Before commit fa066f015 "bridge: Move packet processing functionality into
ofproto," the code that called into what became xlate_normal() prevented
a flow from being installed if it was called at revalidation time and
the MAC learning table lacked an entry for the flow's destination MAC.
That commit instead started dropping packets in that case (because it
incorrectly ignored xlate_normal()'s return value).
This restores the former behavior. It's not clear that the former behavior
is the best possible, but it is strictly better than starting to drop
packets at revalidation time.
Along with the previously fixed problem where flood_vlans were interpreted
incorrectly, this bug broke controller connectivity when flood_vlans was
set to any nonempty value that did not include the VLAN used for the
controller connection (that is, when flood_vlans was interpreted as
flooding the controller VLAN).
Ben Pfaff [Fri, 22 Jul 2011 00:03:03 +0000 (17:03 -0700)]
mac-learning: Fix inverted logic in is_learning_vlan().
When a bit is set in flood_vlans, that VLAN must be flooded, but the logic
here was reversed in the case where there were any flooded VLANs at all.
Thus, if a single VLAN was configured to be flooded, all VLANs except that
one were actually flooded.
The common case where no VLANs were to be flooded was handled correctly.