Justin Pettit [Tue, 17 Nov 2009 01:58:26 +0000 (17:58 -0800)]
ovs-ofctl: Add support for transport and network modification actions
Add support to ovs-ofctl for modifying the network source and destination
IP address with the "mod_nw_src" and "mod_nw_dst" actions, respectively.
And support modifying the TCP/UDP source and destination ports with the
"mod_tp_src" and "mod_tp_dst" actions, respectively.
Justin Pettit [Tue, 17 Nov 2009 02:08:12 +0000 (18:08 -0800)]
ofproto: Support missing set_nw_dst and set_tp_dst translations
The function that translates OpenFlow actions into datapath actions was
missing definitions for OFPAT_SET_NW_DST and OFPAT_SET_TP_DST. This
meant those actions would not occur in the datapath.
Justin Pettit [Tue, 17 Nov 2009 01:51:31 +0000 (17:51 -0800)]
datapath: Calculate proper checksum for set_tp_src/dst action
When the set_tp_src or set_tp_dst action is used, the calculation for
where the checksum is located was wrong. This caused the checksum to
not be updated and corrupted the packet at the bad offset.
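To illustrate why the offset matters, here is a minimal Python sketch (not the datapath code) of an incremental checksum update in the style of RFC 1624 after a 16-bit port is rewritten. The checksum field is at byte offset 16 in a TCP header and at offset 6 in a UDP header, so patching at the wrong offset leaves the real checksum stale and overwrites unrelated bytes. The names set_tp_port, csum_update16 and csum_full are hypothetical.

```python
import struct

# Byte offset of the checksum field within the L4 header.
CSUM_OFFSET = {"tcp": 16, "udp": 6}
# Byte offset of the port fields (identical for TCP and UDP).
PORT_OFFSET = {"src": 0, "dst": 2}


def csum_fold(total):
    """Fold a sum into 16 bits using one's-complement addition."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def csum_full(data):
    """Compute the Internet checksum over a byte string from scratch."""
    if len(data) % 2:
        data += b"\x00"
    words = struct.unpack("!%dH" % (len(data) // 2), data)
    return ~csum_fold(sum(words)) & 0xFFFF


def csum_update16(csum, old, new):
    """Update a checksum after one 16-bit field changes from old to new
    (RFC 1624, eqn. 3: HC' = ~(~HC + ~m + m'))."""
    total = (~csum & 0xFFFF) + (~old & 0xFFFF) + new
    return ~csum_fold(total) & 0xFFFF


def set_tp_port(l4, proto, which, new_port):
    """Rewrite a port in the L4 header (a bytearray) and patch the checksum."""
    port_off = PORT_OFFSET[which]
    csum_off = CSUM_OFFSET[proto]
    (old_port,) = struct.unpack_from("!H", l4, port_off)
    (old_csum,) = struct.unpack_from("!H", l4, csum_off)
    struct.pack_into("!H", l4, port_off, new_port)
    if proto == "udp" and old_csum == 0:
        return l4  # A zero UDP checksum means "no checksum"; leave it alone.
    new_csum = csum_update16(old_csum, old_port, new_port)
    if proto == "udp" and new_csum == 0:
        new_csum = 0xFFFF  # UDP transmits a computed zero as all ones.
    struct.pack_into("!H", l4, csum_off, new_csum)
    return l4
```

For simplicity the sketch ignores the pseudo-header; because the update is incremental, it only needs the old checksum and the old and new port values.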
Justin Pettit [Tue, 17 Nov 2009 00:36:21 +0000 (16:36 -0800)]
ovs-appctl: Fix shadow variable that could cause segfault
The variable "socket_name" contains the name of the unix domain socket
to be used for communicating with the OVS process. If the target does
not begin with a "/", the socket name is determined based on a pidfile.
A shadow copy of "socket_name" was kept in the block that looks at the
pidfile, which would cause the function-level one to not be set. This
removes that shadow copy.
Justin Pettit [Sat, 14 Nov 2009 02:53:28 +0000 (18:53 -0800)]
ofproto: Update time of super-rule to match sub-rule
Rules keep track of their creation and last used time. When a sub-rule
is updated, it wasn't updating the time of its super-rule. This commit
fixes that behavior.
Thanks to Jesse Gross for the help tracking this down.
Justin Pettit [Fri, 13 Nov 2009 23:51:44 +0000 (15:51 -0800)]
ovs-openflowd: Setup default listener
By default, ovs-openflowd was not listening for any management
connections. Tools such as ovs-ofctl attempt to use a default location
based on the datapath name. This change creates that default listener.
Jesse Gross [Sat, 7 Nov 2009 01:13:51 +0000 (17:13 -0800)]
mirroring: Allow learning to be disabled on a VLAN.
RSPAN does not work properly unless MAC learning for the VLAN is
disabled on all switches between the origin and monitoring point.
This allows learning to be disabled on a given VLAN so vSwitch can
act as an intermediate switch.
Jesse Gross [Mon, 9 Nov 2009 23:26:51 +0000 (15:26 -0800)]
bridge: Require learning table at all times.
The bridge nominally allowed the MAC learning module to not be enabled
though in reality it was always used. Tracking active MAC addresses
in the bridge is useful for other reasons besides deciding the output
port - primarily for bonding. In addition there were several bugs
that would have been triggered had learning actually been disabled since
that code path is never tested. This makes it explicit that the learning
table should be maintained at all times.
Justin Pettit [Tue, 10 Nov 2009 00:06:52 +0000 (16:06 -0800)]
vconn: Clean-up "match" typo in comments
A few comments referenced "m", when "match" was clearly meant. This was
likely due to a quick search and replace that scooped up these comments
along with the intended code. This cleans that up.
Ben Pfaff [Mon, 9 Nov 2009 22:46:38 +0000 (14:46 -0800)]
Make ovs-appctl easier to use and synchronize its interface with ovs-vsctl.
It is inconvenient to type the whole path to the Unix daemon socket when
using ovs-appctl. Allow the name of the daemon to be used instead when
a pidfile exists in the default location, and contact ovs-vswitchd by
default.
Also, the various options for manipulating vlog were invented before the
general-purpose command mechanism existed. Get rid of all of the action
options in favor of just specifying the command to be executed as
non-option arguments.
Finally, there simply wasn't much value in allowing multiple targets or
options to be specified; these variations were never used in practice. So
simplify the interface by making it one target, one action per invocation.
Also, make ovs-vsctl use the same syntax for its --target option.
Jesse Gross [Wed, 4 Nov 2009 21:48:41 +0000 (13:48 -0800)]
bonding: Ignore updelay if there is no active slave.
If all slaves on a bond are down but some are waiting for an updelay,
enable the slave with the shortest amount of delay remaining. This
would already occur if all other slaves were disabled at the time the
delay was to begin but not if a delay was already in progress. This
also immediately sends learning packets out in both situations, which
prevents incoming packets to disabled slaves from being blackholed.
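The selection rule described above can be sketched as follows. This is an illustrative Python model, not the bridge code: the Slave class, its fields and choose_active_slave() are hypothetical names, and sending learning packets is left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Slave:
    name: str
    enabled: bool = False
    # Milliseconds left before the updelay expires, or None if the link is
    # down and no delay is pending.
    delay_remaining_ms: Optional[int] = None


def choose_active_slave(slaves: List[Slave]) -> Optional[Slave]:
    """Return an enabled slave; if none is enabled, immediately promote the
    pending slave that is closest to finishing its updelay."""
    for slave in slaves:
        if slave.enabled:
            return slave
    pending = [s for s in slaves if s.delay_remaining_ms is not None]
    if not pending:
        return None  # Every link is down; nothing can be enabled.
    best = min(pending, key=lambda s: s.delay_remaining_ms)
    best.enabled = True
    best.delay_remaining_ms = None  # The rest of the delay is skipped.
    return best
```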
Ben Pfaff [Fri, 6 Nov 2009 18:25:50 +0000 (10:25 -0800)]
backtrace: Avoid GCC warning on x86-64.
The portable implementation of stack_low(), which before this commit is
used on x86-64, provokes a warning from GCC that cannot be disabled. We
already have an i386-specific implementation that does not warn; this
commit adds a corresponding implementation for x86-64 to avoid the warning
there too.
Ben Pfaff [Fri, 6 Nov 2009 18:22:55 +0000 (10:22 -0800)]
backtrace: Suppress dumb GCC warning on x86-64.
Without this change GCC warns "use of assignment suppression and length
modifier together in scanf format", which doesn't actually point out any
real problem (and why would it? Google turns up nothing interesting).
Jesse Gross [Fri, 6 Nov 2009 22:18:58 +0000 (14:18 -0800)]
udatapath: Implement ZERO_TCP_FLAGS option.
An option to zero the TCP flags when querying flow stats was added
to the kernel datapath to support NetFlow active timeouts. This
adds that same support to the user datapath.
Jesse Gross [Fri, 6 Nov 2009 21:26:42 +0000 (13:26 -0800)]
netflow: Only query stats of installed flows.
NetFlow active timeouts were querying the stats of all exact match
flows that had reached a certain age including those that could
not be installed. This was not harmful but it was wasteful and
produced log spew. This changes it to only query the flows that
are actually installed.
Jesse Gross [Wed, 28 Oct 2009 21:36:52 +0000 (14:36 -0700)]
datapath: Allow TCP flags to be cleared.
When querying flow stats allow the TCP flags to be reset. Since
the datapath ORs together all flags that have previously been
seen it is otherwise impossible to determine the set of flags from
after a particular time.
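A small Python model of the behavior, for illustration only: the FlowStats class and the zero_tcp_flags parameter are hypothetical names and do not reflect the datapath interface.

```python
class FlowStats:
    """Per-flow counters with TCP flags accumulated by bitwise OR."""

    def __init__(self):
        self.packets = 0
        self.tcp_flags = 0

    def record_packet(self, tcp_flags):
        """Account for one packet and remember every flag seen so far."""
        self.packets += 1
        self.tcp_flags |= tcp_flags

    def query(self, zero_tcp_flags=False):
        """Return (packets, tcp_flags), optionally resetting the flags so
        the next query reports only flags seen after this one."""
        snapshot = (self.packets, self.tcp_flags)
        if zero_tcp_flags:
            self.tcp_flags = 0
        return snapshot
```

Without the reset, a SYN seen at the start of a long-lived connection would stay set in every later query, so the caller could not tell which flags belong to the most recent interval.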
This commit forced the user to specify an action when deleting a flow,
which is not desirable. The change was not actually needed, as the
buffer is never passed to str_to_flow() in the original code.
Justin Pettit [Wed, 4 Nov 2009 05:19:47 +0000 (21:19 -0800)]
xenserver: Fix issue with deleting network UUID on VLAN destruction
In XenServer, a VLAN is considered an additional network with its own
UUID. The interface-reconfigure script properly adds this network UUID
to the configuration script, but commit 774428 removed the code that
would remove this information on VLAN destruction. Ian Campbell was the
author of that commit and felt that reverting this part was safe.
Justin Pettit [Tue, 3 Nov 2009 21:56:46 +0000 (13:56 -0800)]
Mention running boot.sh when pulling sources from Git
When the sources are pulled directly from Git, it is necessary to run
"./boot.sh" before "./configure" can be run. This commit documents that
useful bit of information.
Jesse Gross [Wed, 28 Oct 2009 23:05:57 +0000 (16:05 -0700)]
ofproto: Only zero stats for non exact-match sub-rules.
We zero the stats on sub-rules after they expire to prevent them
from being counted twice in their super-rule if they are reinstalled.
However, for exact-match sub-rules this means that the OpenFlow stats
are always zero. This changes that to only zero the stats for
non exact match rules.
Reid Price [Fri, 30 Oct 2009 19:39:14 +0000 (12:39 -0700)]
dump-vif-details: Safeguard 'finally' code
This makes several minor streamlining changes to dump-vif-details,
and moves the try statement in dump_vif_info to exclude session
initialization, so that finally will not obscure the original exception
with a new exception related to the session variable when logins fail.
Justin Pettit [Mon, 26 Oct 2009 19:02:02 +0000 (12:02 -0700)]
flow: Differentiate between "port" when printing flows
When printing a flow, there were two references to "port": one for the
interface the packet arrived on and the other for the L4 ports. This could
be a bit confusing to new users looking at the output of a command such
as "ovs-ofctl dump-flows". This commit changes the incoming interface
field from "port" to "in_port".
Jesse Gross [Thu, 22 Oct 2009 18:40:04 +0000 (11:40 -0700)]
bridge: Eject NORMAL flows without a learning entry from datapath.
When revalidating NORMAL flows we consult the learning table, which
could be empty if a packet hasn't come to userspace in a while or we
just did a bridge flush. If there is no learning entry then existing
flows will begin flooding packets until a new flow is set up. The
problem is worse with bonding because we can receive one of the flooded
packets back on a bond slave and learn that port, causing us to send
traffic to the wrong location.
Jesse Gross [Wed, 21 Oct 2009 02:26:55 +0000 (19:26 -0700)]
bonding: Balance bond slaves based on ratio.
Previously when deciding whether to migrate a hash between slaves
we would never move it if it would cause more load on the new slave
than the old. This could lead to a situation where the slaves would
be imbalanced but no migration would occur since it would flip the
load. This will do the migration if it will decrease the ratio.
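One possible reading of the new rule, as a Python sketch. The exact comparison used by the bonding code is not given here, so the functions load_ratio() and should_migrate() are assumptions made for illustration.

```python
def load_ratio(a, b):
    """Ratio of the heavier load to the lighter one."""
    hi, lo = max(a, b), min(a, b)
    if hi == 0:
        return 1.0  # Both slaves idle: perfectly balanced.
    return float("inf") if lo == 0 else hi / lo


def should_migrate(from_load, to_load, hash_load):
    """Move a hash from one slave to another only if that makes the pair
    more even, even when the destination ends up with the larger load."""
    before = load_ratio(from_load, to_load)
    after = load_ratio(from_load - hash_load, to_load + hash_load)
    return after < before
```

With loads of 100 and 10 and a hash carrying 60, the move leaves 40 and 70. A rule that refuses to make the destination busier than the source would reject it, while the ratio test accepts it because the imbalance drops from 10:1 to 1.75:1.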
Jesse Gross [Mon, 12 Oct 2009 20:09:51 +0000 (13:09 -0700)]
bonding: Drop unicast packets which have a different learned port.
Drop packets received on a bond port if we have learned a different
source port for that MAC. We were already doing this for multicast
packets but extend the logic to unicast packets as well since the
same situation can occur if the connected switch has not learned the
MAC address and is flooding. Otherwise vSwitch will learn the bond
port as the source of that MAC.
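The check can be sketched roughly as below. This is a simplified Python model with hypothetical names (admit_on_bond, mac_table); it ignores VLANs and any other conditions the bridge applies.

```python
def admit_on_bond(mac_table, src_mac, in_port, bond_port):
    """Return True if a packet arriving on in_port should be accepted.

    mac_table maps a source MAC address to the port it was learned on.
    """
    learned_port = mac_table.get(src_mac)
    if (in_port == bond_port
            and learned_port is not None
            and learned_port != bond_port):
        # The MAC is known to live behind another port, so this copy most
        # likely came back from an upstream switch that is flooding.
        # Accepting it would relearn the MAC on the bond port.
        return False
    return True
```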
Jesse Gross [Tue, 20 Oct 2009 03:14:31 +0000 (20:14 -0700)]
xen: Restore state files for VIF VLANs
A change on master to use ovs-vsctl instead of state files for VLANs
was ported to the citrix branch, which does not have ovs-vsctl. The
interface reconfigure portion, which does not store the state files,
was ported but the vif hotplug script portion was not. This restores
interface reconfigure to again save the state files.
Jesse Gross [Tue, 20 Oct 2009 01:40:11 +0000 (18:40 -0700)]
xen: Correctly write VLAN key in config file.
When adding the VLAN key the name of the vif was from a variable in
use on only the xs5.7 branch. This uses the correct variable name
for the master branch.
Ben Pfaff [Fri, 16 Oct 2009 16:36:25 +0000 (09:36 -0700)]
ovs-vsctl: Refactor internals to increase flexibility.
This changes the interface of each of the command implementations, making
them take the configuration as an argument and return the output. This
will make it easier to support alternate output formats and to execute more
than one command per invocation (both happening in upcoming commits).
Ben Pfaff [Thu, 15 Oct 2009 19:47:05 +0000 (12:47 -0700)]
ovs-vsctl: Allow bridge name to be omitted from del-port command.
The 'bridge' argument to ovs-vsctl's del-port command is only supplied as
a form of error checking. Sometimes the name of the bridge isn't readily
available, so for such situations this commit allows the user to omit the
name of the bridge entirely.
Ben Pfaff [Fri, 16 Oct 2009 16:26:20 +0000 (09:26 -0700)]
ovs-vsctl: Log changes to configuration file to syslog.
This feature, which has been in ovs-cfg-mod for some time as the "-c"
option, makes it much easier to see what changes ovs-vsctl actually makes
to ovs-vswitchd.conf.
Ben Pfaff [Thu, 15 Oct 2009 17:39:10 +0000 (10:39 -0700)]
Make sure that time advances in a daemon between calls to time_refresh().
Open vSwitch uses an interval timer signal to tell it that its cached idea
of the current time has expired. However, this didn't work in a daemon
detached from the foreground session (invoked with --detach) because a
child created with fork() does not inherit the parent's interval timer and
we did not re-set it after calling fork().
This commit fixes the problem by setting the interval timer back up after
calling fork() from daemonize().
This fix is based on code inspection (which was then verified to be correct
through testing). It may not fix any actual problems in practice, because
time_refresh() is called every time through the poll loop, and the poll
loop typically runs more quickly than the periodic timer fires (1 ms or so
average in ovs-vswitchd, vs. 100 ms timer interval).
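The underlying behavior can be observed with a short Python sketch on a POSIX system. It is not the daemonize() code; timer_survives_fork() is a hypothetical helper that arms a 100 ms interval timer, forks, and reports whether the child has a timer armed.

```python
import os
import signal


def timer_survives_fork(rearm_in_child):
    """Arm ITIMER_REAL, fork, and report whether the child has it armed."""
    # Install a no-op handler so SIGALRM does not terminate the process.
    signal.signal(signal.SIGALRM, lambda signum, frame: None)
    signal.setitimer(signal.ITIMER_REAL, 0.1, 0.1)  # 100 ms periodic timer
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:  # Child process.
        os.close(read_fd)
        if rearm_in_child:
            # The fix: set the interval timer back up after fork().
            signal.setitimer(signal.ITIMER_REAL, 0.1, 0.1)
        _, interval = signal.getitimer(signal.ITIMER_REAL)
        os.write(write_fd, b"1" if interval > 0 else b"0")
        os._exit(0)
    # Parent process: disarm its own timer, then collect the child's answer.
    os.close(write_fd)
    signal.setitimer(signal.ITIMER_REAL, 0)
    answer = os.read(read_fd, 1)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return answer == b"1"
```

Calling timer_survives_fork(False) shows the problem (the child has no timer), and timer_survives_fork(True) shows the effect of re-arming it in the child.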
Ben Pfaff [Wed, 24 Jun 2009 21:58:57 +0000 (14:58 -0700)]
datapath: Ignore return value from rtnl_notify().
In Linux 2.6.30, the rtnl_notify() return type was changed from int to
void along with the following commit message:
This patch also modifies the rtnetlink code to ignore the return
value of rtnl_notify() in all callers. The function rtnl_notify()
(before this patch) returned the error of the unicast notification
which makes rtnl_set_sk_err() reports errors to all listeners. This
is not of any help since the origin of the change (the socket that
requested the echoing) notices the ENOBUFS error if the notification
fails and should resync itself.
Thus there's no point in checking the return value, even in older versions
of the kernel, and so this commit changes our code to ignore it, even
on older kernel versions. We also update the rtnl_notify() wrapper macros
to make the return type void on older kernel versions.
This has not been tested, just built.
Thanks to Mikio for spurring me to try building with Linux 2.6.29 and
2.6.30.
Jesse Gross [Thu, 8 Oct 2009 19:31:03 +0000 (12:31 -0700)]
vlan: Compare vlan tags before implicit tagging when RSPANing.
We check that a packet is not sent out on the in port on the
same VLAN when performing RSPAN. However, we were comparing the
vlan tag from a packet after implicit tagging with a tag from
before implicit tagging. This ensures that we always compare them
before such tagging.
Ben Pfaff [Fri, 9 Oct 2009 16:41:29 +0000 (09:41 -0700)]
datapath: Fix build with Centos 5.3 kernel.
Centos 5.3 backports more functions from later kernel versions to 2.6.18,
so the kernel version number is no longer a reliable way to check for these
functions. Thus, add a "configure" test for them.
Ben Pfaff [Wed, 7 Oct 2009 19:07:27 +0000 (12:07 -0700)]
xenserver: Crossport "master" interface-reconfigure to "citrix".
This commit copies the interface-reconfigure script from "master" into
"citrix" and fixes up a few incompatibilities: the location of ovs-cfg-mod,
which in master is in /usr/bin and in citrix is in /root/vswitch/bin, and
the RPM spec file fragments needed to initialize the database cache.
The purpose of this commit is to obtain the bug fixes that have been
applied (mainly by Ian Campbell) to "master" but which are not in "citrix".
It's difficult to understand the changes from this commit alone. It is
more meaningful to compare the resulting files against those currently
on the master branch.
Jesse Gross [Fri, 14 Aug 2009 20:47:28 +0000 (13:47 -0700)]
ofproto: Make current packet counts more accurate.
When the stats for a currently active flow are requested this
ensures that the packets not handled by the kernel are counted
immediately. Before, these packets would only be counted once
the kernel flow expired and the counts were combined.
Jesse Gross [Thu, 8 Oct 2009 19:20:10 +0000 (12:20 -0700)]
bonding: Compare ports, not interfaces, for loop checks.
In order to avoid loops we check that the input and output ports
are not equal. When selecting mirror outputs for RSPAN we were
checking interfaces instead of ports. This led to loops when
using RSPAN with bonded ports.
Ben Pfaff [Wed, 7 Oct 2009 17:19:31 +0000 (10:19 -0700)]
xenserver: Fix ovs-vsctl in built RPM by defining /etc as sysconfdir.
By default, the "configure" script picks a sysconfdir of $prefix/etc,
which works out to /usr/etc in our case. That's wrong, of course--it
should be /etc--but we didn't notice until now because sysconfdir was
only used in ovs-vsctl, which in turn wasn't used at all on a XenServer
system until recently.
This bug is present on all branches, but it is only potentially visible
on "master" and "xs5.7", since only those have ovs-vsctl. It is only
actually visible on "xs5.7", since that is the only branch where the
system uses ovs-vsctl itself (from /etc/xensource/scripts/vif), but this
is being committed to master in case we start using ovs-vsctl there too.
Jesse Gross [Mon, 5 Oct 2009 20:25:19 +0000 (13:25 -0700)]
netflow: Increase maximum number of NetFlow records to 30.
NetFlow v5 allows up to 30 records per packet but we were incorrectly
limiting to 29. This corrects that and also uses the count of the
number of records in the header rather than the packet size since
it is easier to reason about.
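The arithmetic behind the limit, as a Python sketch. The 24-byte header and 48-byte record sizes come from the NetFlow v5 export format; the function names are hypothetical.

```python
NETFLOW_V5_HEADER_LEN = 24   # bytes
NETFLOW_V5_RECORD_LEN = 48   # bytes
NETFLOW_V5_MAX_RECORDS = 30


def packet_len(n_records):
    """Size in bytes of a NetFlow v5 payload carrying n_records records."""
    return NETFLOW_V5_HEADER_LEN + n_records * NETFLOW_V5_RECORD_LEN


def should_flush(n_records):
    """Decide when to send, based on the record count kept in the header
    rather than on the accumulated packet size."""
    return n_records >= NETFLOW_V5_MAX_RECORDS
```

A full packet is 24 + 30 * 48 = 1464 bytes. Testing the record count directly avoids the kind of off-by-one that is easy to make when comparing byte sizes.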
Jesse Gross [Sat, 3 Oct 2009 01:08:05 +0000 (18:08 -0700)]
bonding: Update the link status on the bond fake interface.
Brings the fake bond interface up and down to match our notion of
whether the bond is currently active. This solves an issue where
XenCenter would always show the bond as disconnected.
Ian Campbell [Mon, 5 Oct 2009 15:32:53 +0000 (16:32 +0100)]
xenserver: use ovs-vsctl for VIF VLANs instead of separate state files
ovs-vsctl did not exist when this code was originally written, but it
provides exactly what is needed to get rid of those separate state
files.
The vif hotplug script diff is against the xs5.7 branch but I think is
applicable to master and/or citrix with just context changes.
I was thinking of using ovs-vsctl exclusively for configuration
modifications from the vif hotplug script but that would need a
mechanism to pass the additional vif details to ovs-vsctl add-port as
well as perhaps making the bridge optional to del-port. The other option
would be to use the --no-reload option and split the config mods into
two parts, but I don't like that idea much.
Ian Campbell [Mon, 5 Oct 2009 15:27:01 +0000 (16:27 +0100)]
xenserver: Include bridge.*.xs-network-uuids for all networks
Previously I advised that only networks which were currently attached to
the host be listed in /etc/ovs-vswitchd.conf. However I've just realised
that this interacts badly with the slightly ugly special case used for
PIF.currently-attached when reading from dbcache instead of talking to
Xapi. This bites on boot when /etc/init.d/management-interface tries to
plug a selection of PIFs which are deemed to be somehow required by
xapi. (not helped by a bug in XenServer 5.7.0 which can cause this list
to be larger than it should be and not internally consistent).
For now I think it prudent to simply list all networks which could
potentially be attached to a given datapath, until I can figure out what
the sane fix is on the XenServer end.
(I think there are two options for a proper fix, either inspect the
current state of the network devices or assume dbcache represents the
desired final state after devices are plugged on boot. I'm leaning
towards the latter since the dbcache should indicate the set of PIFs
which were attached on shutdown, which xapi will likely be trying to
replug on boot... Needs more thought though).
Ben Pfaff [Fri, 2 Oct 2009 20:29:01 +0000 (13:29 -0700)]
vswitch: Allow user to set Ethernet address of any internal interface.
Until now the vswitch configuration file has allowed the user to configure
the MAC address on bridge local ports only. This commit adds the ability
to configure them on any internal interface.
It would be logical to extend this to any bridge port, period, but many
network devices must be brought down before their Ethernet addresses may be
changed. Bringing a network interface down and then back up can reset a
lot of state, so as we don't actually need the ability to change any bridge
port's MAC address yet this commit does not implement it.
Justin Pettit [Fri, 2 Oct 2009 22:20:12 +0000 (15:20 -0700)]
dpif-linux: Fail earlier if OVS kernel module isn't loaded
When the kernel module isn't loaded, the bridge tries to open all the
possible minor devices, regardless. This change first checks that there
is a major device number for Open vSwitch and only then tries to open the
minor devices.
This change also removes the assumption that there's a default Open vSwitch
major device number, since the kernel module always attempts to get a
dynamic one. Maybe one day we'll have one...
Fixes a bug whereby netdev_linux_set_etheraddr() would update the cached
Ethernet address but not mark it valid. (This potentially wasted a system
call later but wasn't harmful.)
As an added optimization, don't set the Ethernet address at all if the
new address is the same as the current address.
Jesse Gross [Fri, 2 Oct 2009 17:31:20 +0000 (10:31 -0700)]
netdev-linux: Return correct error codes on receive.
netdev_linux_receive was returning positive error codes while the
interface specifies that it should be returning negative errors.
This difference causes a huge increase in (nonexistent) packet
processing with the userspace datapath.
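A Python sketch of why the sign matters, assuming the common convention that a receive function returns the packet length on success and a negative errno value on failure. The receive() and poll_once() functions are hypothetical and only model the caller's point of view.

```python
import errno


def receive(queue, positive_errors=False):
    """Return the packet length on success or an errno value on failure.

    With positive_errors=True this mimics the bug: the error code comes
    back positive and is indistinguishable from a packet length.
    """
    if not queue:
        return errno.EAGAIN if positive_errors else -errno.EAGAIN
    return len(queue.pop(0))


def poll_once(queue, positive_errors=False):
    """Return True if the caller believes it received a packet."""
    retval = receive(queue, positive_errors)
    return retval >= 0
```

With an empty queue, poll_once([]) reports no packet, while poll_once([], positive_errors=True) makes the caller process a packet that does not exist.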
Ben Pfaff [Mon, 28 Sep 2009 23:03:49 +0000 (16:03 -0700)]
xenserver: Add script refresh-xs-network-uuids.
On pool join, the bridge.<bridge>.xs-network-uuids key is not updated
properly for the primary management interface. We don't have a proper
fix for this problem yet, and probably won't ever have one for XenServer
5.5.0, so this commit adds a script that works around the problem.
Running the script is a shortcut for rebooting the XenServer host,
which should also solve the problem.
Ben Pfaff [Mon, 28 Sep 2009 17:15:22 +0000 (10:15 -0700)]
debian: Make dependencies on openvswitch packages specify exact version.
NOX packages depend on a particular version of openvswitch-pki, which
depends on openvswitch-common without specifying a version. This meant
that the installed versions of openvswitch-pki and openvswitch-common
could easily get out of sync. This commit makes all of the dependencies
among openvswitch packages specify an explicit version, which should fix
this problem.