Ethan Jackson [Fri, 7 Oct 2011 05:43:05 +0000 (22:43 -0700)]
cfm: New 'cfm_opstate' setting.
In some cases, a controller may want to take an interface down for
forwarding purposes, but avoid completely deconfiguring CFM and
thus lose all connectivity monitoring. The new 'cfm_opstate'
setting is a way to achieve this behavior.
OVS_ACTION_ATTR_SAMPLE has never been implemented in dpif-netdev. This
commit implements it and adds a cast to enum ovs_action_type in the switch
statement that checks the action type, so that GCC complains if we forget
to add a case for a new action type.
I had to assign the return value of nl_attr_type() to a temporary variable,
because "switch ((enum ovs_action_type) nl_attr_type(a))" provoked a GCC
warning that I've never seen before:
../lib/dpif-netdev.c:1260: warning: cast from function call of type 'int'
to non-matching type 'enum ovs_action_type'
Ben Pfaff [Wed, 5 Oct 2011 16:04:50 +0000 (09:04 -0700)]
dpif-netdev: Simplify code by removing dpif_netdev_validate_actions().
dpif_netdev_validate_actions() existed for three reasons. First, it checked
that the actions were well-formed and valid. This isn't really necessary,
because the actions are built internally by ofproto-dpif and will always be
well-formed. (If not, that's a bug in ofproto-dpif.) Second, it checks
whether the actions will modify (mutate) the data in the packet and reports
that to the caller, which can use it to optimize what it does. However,
the only caller that used this was dpif_netdev_execute(), which is not a
fast-path (if dpif-netdev can be said to have a fast path at all).
Third, dpif_netdev_validate_actions() rejects certain actions that
dpif-netdev does not implement: OVS_ACTION_ATTR_SET_TUNNEL,
OVS_ACTION_ATTR_SET_PRIORITY, and OVS_ACTION_ATTR_POP_PRIORITY. However,
this doesn't really seem necessary to me. First, dpif-netdev can't support
tunnels in any case, so OVS_ACTION_ATTR_SET_TUNNEL shouldn't come up.
Second, the priority actions just aren't important enough to worry about;
they only affect QoS, which isn't really important with dpif-netdev since
it's going to be slow anyway.
So this commit just drops dpif_netdev_validate_actions() entirely.
Ben Pfaff [Wed, 5 Oct 2011 18:06:12 +0000 (11:06 -0700)]
openflow: Delete icmp_type and icmp_code macros.
These macros caused trouble if datapath-protocol.h was included before
openflow.h. Later references to the icmp_type and icmp_code members of
struct ovs_key_icmp caused compiler errors, because the macros caused them
to try to refer to nonexistent tp_src and tp_dst members in those
structures.
Ben Pfaff [Fri, 7 Oct 2011 22:43:43 +0000 (15:43 -0700)]
cfm: Send properly formatted CCMs.
Wireshark complained that Open vSwitch-generated CFM messages were
malformed. Upon looking at the standard, I spotted that Open vSwitch
failed to include the final, required "End TLV" byte with value 0.
This commit adds the End TLV byte to generated CCMs but still accepts
the truncated messages for backward compatibility.
Commit b063d9f0 "datapath: Use unicast Netlink sockets for upcalls" that
introduced an 'upcall_pid' member into struct dpif_linux_vport, struct
dpif_linux_dp, and struct dpif_linux_flow neglected to do so only if the
member was nonzero. This caused every datapath, vport, and flow operation
to supply an upcall_pid. In particular, the netdev_set_config() called at
startup when a vport already existed caused the upcall_pid for that vport
to be reset to 0, which in turn caused all packets received on the vport to
be dropped instead of forwarded to ovs-vswitchd.
Reported-by: Shih-Hao Li <shli@nicira.com>
Bug #7714.
Pravin B Shelar [Fri, 7 Oct 2011 04:52:39 +0000 (21:52 -0700)]
datapath: Remove RT kernel support.
Following patch removes RT kernel support. This allows us to cleanup
the loop detection.
Along with this BH is now disabled while running execute_actions()
for packet from user-space.
As a result we can simplify the stats code as entire send and receive
path runs in BH context on all supported platforms.
Pravin B Shelar [Fri, 7 Oct 2011 02:45:09 +0000 (19:45 -0700)]
datapath: Fix recv path for CONFIG_PREEMPT_RCU.
In case CONFIG_PREEMPT_RCU, rcu grace period waits only for RCU
read-side critical sections that are delimited by rcu_read_lock() and
rcu_read_unlock(). internal_dev_xmit() is called in
rcu_read_lock_bh context. Therefore we need to explicitly take rcu
lock to prevent race with call_rcu() in PREEMPT_RCU case.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
Ethan Jackson [Tue, 4 Oct 2011 22:47:07 +0000 (15:47 -0700)]
debian: Fully convert ovs-monitor-ipsec to vlog.
Commit 201bf205 "ovs-monitor-ipsec: Convert to vlog." only
partially updated ovs-monitor-ipsec to the new vlog module. This
commit completes the process.
Ben Pfaff [Wed, 5 Oct 2011 21:55:02 +0000 (14:55 -0700)]
debian: Package the installed Python files, not those from the source tree.
The only difference between the Python files that are installed and the
Python files found in the source tree is in the ovs.dirs module, but this
is a very important difference: we want the directories used to be the ones
configured in (e.g. /usr/share/openvswitch), not the only used by default
by the source tree's dirs.py (e.g. /usr/local/share/openvswitch).
I verified with "dpkg-deb -x" and "diff -ur" that in fact this is the only
change that this commit makes.
This bug has been in place since at least commit 1d273d6d8 "debian: Rename
openvswitch-python to python-openvswitch" from over a year ago, but until
now the packaged Python files didn't actually use any directories that
differed between the two versions of dirs.py, so only now has the problem
manifested.
This problem prevented ovs-monitor-ipsec from finding the OVSDB schema
file.
Ben Pfaff [Wed, 5 Oct 2011 18:18:13 +0000 (11:18 -0700)]
dpif: Avoid use of "struct ovs_dp_stats" in platform-independent modules.
Over time we wish to reduce the number of datapath-protocol.h definitions
used directly outside of Linux-specific code. This commit removes use of
"struct ovs_dp_stats" from platform-independent code.
Pravin B Shelar [Wed, 5 Oct 2011 00:48:33 +0000 (17:48 -0700)]
datapath: Fix tunnel hashing.
struct port_lookup_key is getting extra 4 bytes due to alignment on x86_64.
That is messing with hash calculation which uses sizeof operator to
calculate key length. Following patch fixes tunnel hashing by using correct
key length.
Justin Pettit [Tue, 4 Oct 2011 18:49:22 +0000 (11:49 -0700)]
nicira-ext: Fix build problems on 64-bit systems.
Commit d2c0fe (nicira-ext: Bump number of registers to five from four.)
broke the build on 64-bit systems. This commit fixes the problems it
introduced.
Ben Pfaff [Wed, 21 Sep 2011 17:07:11 +0000 (10:07 -0700)]
vswitchd: Document map members as separate columns
The OVS configuration database now has numerous columns that contain fixed
key-value pairs. Currently there's no way to see these at a glance,
because they are not presented in the summary tables just before the
detailed descriptions.
This commit extends the XML format so that keys within a column can be
described individually, and rearranges and rewrites vswitch.xml to take
advantage of this feature.
Ben Pfaff [Tue, 4 Oct 2011 16:26:14 +0000 (09:26 -0700)]
check-structs: Add check that OFP_ASSERT is checking the right structures.
This avoids a fairly common issue in which a developer cuts and pastes a
structure definition and forgets to update the structure name inside the
OFP_ASSERT, so that the new structure's size doesn't really get checked at
all.
Ethan Jackson [Mon, 3 Oct 2011 19:51:33 +0000 (12:51 -0700)]
ofproto-dpif: LACP registration should cause revalidation.
Whenever a slave is registered to participate in LACP, it needs to
be revalidated so that it can receive LACP PDUs. This bug can only
surface in an edge case where a pre-existing interface is added to
a pre-existing bond. It would be unusual for a controller to do
this.
Ethan Jackson [Mon, 3 Oct 2011 19:51:02 +0000 (12:51 -0700)]
ofproto-dpif: Revalidate on port additions and deletions.
The addition of a new port to an ofproto-dpif may require
revalidations in some cases. Notably if this new port is
configured to participate in CFM, but a drop flow has already been
installed in the datapath for CFM messages with the same in_port.
Ben Pfaff [Mon, 3 Oct 2011 16:19:29 +0000 (09:19 -0700)]
debian: Make python-openvswitch packaging work with squeeze dh_python2.
The dh_python2 helper in Debian squeeze has a limitation that is not
mentioned anywhere, as far as I can tell: Python files must be in
/usr/lib/python#.#/site-packages to be installed. The version in Debian
wheezy does not have the same limitation.
This meant that building the Debian packages on squeeze silently produced
a broken python-openvswitch package, whereas building the same thing on
wheezy built a working package.
This fixes the problem by putting the .py files where squeeze expects them.
It works on wheezy too.
Bug #7510. Reported-by: Michael Hu <mhu@nicira.com> Tested-by: Simon Horman <horms@verge.net.au>
We define some constants for dealing with vlan PCP bits since at
the time they didn't exist upstream. They've since been merged
upstream with different names and we don't use them anyways, so
just drop them.
Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Thu, 29 Sep 2011 22:36:14 +0000 (15:36 -0700)]
flow: Move flow_extract_stats() to dpif.c, as dpif_flow_stats_extract().
The "flow" module is concerned only with OpenFlow flows these days. It
shouldn't have anything to do with ODP or dpifs. However, it included
dpif.h just to implement flow_extract_stats(). This function is a better
fit for dpif.c, so this commit moves it there and removes the dpif.h
#include from flow.h and flow.c
This commit also removes a few more dpif.h #includes that weren't needed.
Ben Pfaff [Thu, 29 Sep 2011 22:18:58 +0000 (15:18 -0700)]
Remove a pair of unnecessary dependencies on datapath-protocol.h.
These headers don't really need datapath-protocol.h. connmgr.h indirectly
used "struct nlattr" from that header, so add a forward declaration. (The
next commit will remove use of struct nlattr entirely from that header,
since it is not really appropriate.)
cfm: Cleanup output of ovs-appctl "cfm/show" command.
When no remote MPIDs were found, the output would print an extra newline.
If multiple remote MPIDs were found, the lines would run together. This
commit cleans things up a bit by just printing each item on its own line
without any blank lines.
xenserver: Correct location of xen-bugtool plugins.
xen-bugtool plugins were stored in "/etc/xensource/bugtool/bugtool-plugins/*",
but xen-bugtool will not traverse deeper than "/etc/xensource/bugtool/*".
This commit corrects the location and makes our plugins run.
signed-off-by: Justin Pettit <jpettit@nicira.com>
acked-by: Ben Pfaff <blp@nicira.com>
in-band: Classifier rules should use OpenFlow ports.
The in-band rules want to allow traffic to and from the local port
but the rule to be inserted in the classifier table for DHCP traffic
used the datapath constant for local traffic instead of the OpenFlow
one.
make_packet_out() operates solely at the OpenFlow layer, so it
should never receive OVSP_LOCAL as an in_port. This function is
used only by the learning switch, which will never generate
a packet with OVSP_LOCAL so this fixes not a bug but a violation
of semantics.
datapath-protocol.h is increasingly a Linux-specific header so its
usage should be restricted from generic code. Nothing in flow.c
uses it, so drop it to avoid layer violations.
Ben Pfaff [Thu, 29 Sep 2011 16:49:37 +0000 (09:49 -0700)]
datapath: Avoid use-after-free error in dp_device_event().
Commit f14d80834 "datapath: genl_notify() on port disappearances" frees the
vport before passing it to ovs_vport_cmd_build_info(), which reads the
freed data.
Without this commit, the following commands consistently trigger a kernel
BUG report on my test VM (which has slab debugging enabled) on 3 attempts:
tunctl
ovs-vsctl add-port br0 tap0
tunctl -d tap0
With this commit, I consistently don't see the BUG, on a few hundred tries
in a tight loop.
The interesting log information is:
device tap0 entered promiscuous mode
device tap0 left promiscuous mode
BUG: unable to handle kernel paging request at 6b6b6ba7
IP: [<c88269ed>] get_vport_protected+0x8/0x52 [openvswitch_mod]
*pde = 00000000
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:04.0/net/eth1/carrier
Modules linked in: brcompat_mod openvswitch_mod
Ben Pfaff [Wed, 28 Sep 2011 16:36:13 +0000 (09:36 -0700)]
bridge: Clear out all Interface fields when an interface cannot be created.
When an Interface record is invalid (for example, when the interface that
it specifies does not exist and cannot be created), ovs-vswitchd would
leave any pre-existing data in its columns, except that it would set the
ofport column to -1 to indicate the error. This was sometimes confusing
because, for example, the lacp_current field could still be set to "true"
if LACP has previously been active and up-to-date.
This commit changes ovs-vswitchd to reset all such data to its default
values when an interface is invalid.
Bug #7450. Reported-by: Duffie Cooley <dcooley@nicira.com>
Bug #7491. Reported-by: Ethan Jackson <ethan@nicira.com>
Release Notes #7500. Reported-by: Keith Amidon <keith@nicira.com>
Ben Pfaff [Thu, 29 Sep 2011 06:07:11 +0000 (23:07 -0700)]
ovs.daemon: Fix semantics of --pidfile option.
The --pidfile option is supposed to work like this:
* Without --pidfile, you don't get a pidfile.
* With --pidfile, you get the default pidfile.
* With --pidfile=FILE, you get FILE as your pidfile.
However, it actually worked like this:
* Without --pidfile, you got the default pidfile.
* With --pidfile, you got no pidfile at all.
* With --pidfile=FILE, you got FILE as your pidfile.
This is because of the semantics of "default" in argparse. It is
documented as:
The default keyword argument of add_argument(), whose value defaults to
None, specifies what value should be used if the command-line argument
is not present. For optional arguments, the default value is used when
the option string was not present at the command line.
We actually want "const", which is documented under the description of
nargs="?" as:
If no command-line argument is present, the value from default will be
produced. Note that for optional arguments, there is an additional
case - the option string is present but not followed by a command-line
argument. In this case the value from const will be produced.
If the routing table is destroyed and re-created then it will
trigger another assertion because route_notifier is not NULL,
even though it has already been freed.
Following patch adds sampling action which takes probability and set
of actions as arguments. When probability is hit, actions are executed for
given packet.
USERSPACE action's userdata (u64) is used to store struct
user_action_cookie as cookie. CONTROLLER action is fixed accordingly.
Now we can remove sFlow code from kernel and implement sFlow generically
as SAMPLE action. sFlow is defined as SAMPLE Action with probability (sFlow
sampling rate) and USERSPACE action as argument. USERSPACE action's data
is used as cookie. sFlow uses this cookie to store output-port, number of
output ports and vlan-id. sample-pool is calculated by using vport
stats.
route-table: Close netlink notifier before closing nln.
Commit 2ee6545f2bff7eb27e8c84965e3ff38dfa909bf6 "notifiers: Create
and destroy nln_notifiers." requires callers to explicitly create
and destroy netlink notifiers but route-table only did the creation
part. This causes an assertion failure any time the netdev for a
vport is destroyed (for example ovs-dpctl show when there is a
tunnel port).
Ethan Jackson [Wed, 21 Sep 2011 22:43:27 +0000 (15:43 -0700)]
python: Create new vlog module.
Currently, each python daemon has to come up with it's own logging
solution. These logging strategies are not consistent across the
python code or with the C vlog module. This patch adds a new
logging module which hopes to solve the problem. This new module
generates log messages in a manner consistent with the C code.
Furthermore, it can easily be extended to support things like rate
limiters in the future.
This patch does not update any python code to use the new module.
Ethan Jackson [Thu, 22 Sep 2011 19:05:18 +0000 (12:05 -0700)]
python: Backport argparse to older platforms.
Argparse has some convenient advantages over optparse including the
ability to handle optional arguments to flags. It also supports
parsing arguments as well as options.
This patch copies argparse.py from Python 2.7 into a newly created
compat directory. It made some very minor syntactic updates in the
process. Platforms which have a Python version too old to include
argparse by default will have this compat version installed as a
workaround.
Ben Pfaff [Mon, 26 Sep 2011 20:15:54 +0000 (13:15 -0700)]
ofproto-dpif: Get rid of OFP_VLAN_NONE pollution.
OFP_VLAN_NONE used to be convenient as a value for struct dst's 'vlan'
member, because it ended up being used in actions anyway, but now it's
much better to just use 0.
Ben Pfaff [Sat, 24 Sep 2011 00:11:49 +0000 (17:11 -0700)]
ofproto-dpif: Fix VLAN and other field handling in OFPP_NORMAL.
compose_actions(), which is part of the OFPP_NORMAL implementation, had
multiple flaws that this commit corrects.
First, it did not commit changes made to the flow by actions preceding
the output to OFPP_NORMAL. This means that, for example, if an OpenFlow
action to modify an L2 or L3 header preceded the output to OFPP_NORMAL,
then OFPP_NORMAL would output its packets without those changes.
Second, it did not update the action_xlate_ctx's notion of the VLAN that
was currently set, in the cases where it modified the current VLAN. This
means that actions following the output to OFPP_NORMAL could output to
unexpected VLANs.
Third, when it switched to VLAN 0, it unconditionally also stripped the
whole 802.1Q header, so that if the packet originally was priority tagged,
the output packet was not priority tagged. This is reasonable behavior,
but it is a change in behavior from what previous releases of OVS did, so
this commit reverts it.
Based on a patch from and a conversation with Pravin Shelar
<pshelar@nicira.com>.
Ethan Jackson [Sat, 24 Sep 2011 06:43:12 +0000 (23:43 -0700)]
python: Style cleanup.
This patch does minor style cleanups to the code in the python and
tests directory. There's other code floating around that could use
similar treatment, but updating it is not convenient at the moment.
ovs-bugtool: Use RUNDIR macro for ovs-appctl target.
Correct the target path used by ovs-appctl. The previous value was
hard-coded to "/var/run", but the common path is actually
"/var/run/openvswitch". However, it's better to use RUNDIR, since the
default location is build-time configurable.
dpif-linux: Prevent a single port from monopolizing upcalls.
Currently it is possible for a client on a single port to generate
a huge number of packets that miss in the kernel flow table and
monopolize the userspace/kernel communication path. This
effectively DoS's the machine because no new flow setups can take
place. This adds some additional fairness by separating each upcall
type for each object in the datapath onto a separate socket, each
with its own queue. Userspace then reads round-robin from each
socket so other flow setups can still succeed.
Since the number of objects can potentially be large, we don't always
have a unique socket for each. Instead, we create 16 sockets and
spread the load around them in a round robin fashion. It's theoretically
possible to do better than this with some kind of active load balancing
scheme but this seems like a good place to start.
netlink: Expose version of nl_attr_find for key and len.
Many of our functions pass around a pointer to Netlink attributes
and a length. This exposes the version of nl_attr_find that takes
that format so it can be used by callers outside the Netlink library.
poll-loop: Enable checking whether a FD caused a wakeup.
Each time we run through the poll loop, we check all file descriptors
that we were waiting on to see if there is data available. However,
this requires a system call and poll already provides information on
which FDs caused the wakeup so it is inefficient as the number of
active FDs grows. This provides a way to check whether a given FD
has data.