Ben Pfaff [Wed, 21 Sep 2011 17:43:03 +0000 (10:43 -0700)]
python: Implement write support in Python IDL for OVSDB.
Until now, the Python bindings for OVSDB have not supported writing to the
database. Instead, writes had to be done with "ovs-vsctl" subprocesses.
This commit adds write support and brings the Python bindings in line with
the C bindings.
This commit deletes the Python-specific IDL tests in favor of using the
same tests as the C version of the IDL, which now pass with both
implementations.
This commit updates the two users of the Python IDL to use the new write
support. I tested these updates only by writing unit tests for them,
which appear in upcoming commits.
Ben Pfaff [Tue, 20 Sep 2011 18:24:44 +0000 (11:24 -0700)]
ovs.db.types: Add table reference to ovs.db.types.BaseType.
Until now ovs.db.types.BaseType has kept track of the name of the
referenced table but not a reference to it. This commit renames the
ref_table attribute to ref_table_name and adds a new ref_table attribute
whose value is a reference to the named table.
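As a rough illustration of the rename (not the actual ovs.db.types code; the
Table class and the resolve_refs() helper below are hypothetical), a type can
keep both the name of the referenced table and, once all tables are known, a
reference to the table object itself:

```python
class Table:
    """Hypothetical stand-in for a schema table."""
    def __init__(self, name):
        self.name = name


class BaseType:
    """Hypothetical, simplified column type that may reference a table."""
    def __init__(self, ref_table_name=None):
        self.ref_table_name = ref_table_name  # name only, known at parse time
        self.ref_table = None                 # filled in once all tables exist


def resolve_refs(types, tables):
    """Point each type's ref_table at the table named by ref_table_name."""
    for t in types:
        if t.ref_table_name is not None:
            t.ref_table = tables[t.ref_table_name]


tables = {"Bridge": Table("Bridge"), "Port": Table("Port")}
port_ref = BaseType(ref_table_name="Port")
resolve_refs([port_ref], tables)
```

After resolve_refs() runs, port_ref.ref_table is the "Port" table object, so
code that follows a reference no longer needs a separate lookup by name.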
This will be useful in an upcoming commit where table references are
actually followed.
Ben Pfaff [Fri, 16 Sep 2011 00:17:36 +0000 (17:17 -0700)]
python: Accept multiple forms of strings and lists when parsing JSON.
The JSON parser in OVS always yields unicode strings and lists, never
non-unicode strings or tuples, but it's easy to create them when building
JSON elsewhere, so accept both forms.
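A minimal sketch of the idea, written for Python 3, where "bytes" stands in
for the non-unicode string form. The helper names are hypothetical and this
is not the parser's actual code:

```python
def is_json_string(value):
    # "str" is the unicode form; "bytes" plays the non-unicode role here.
    return isinstance(value, (str, bytes))


def is_json_array(value):
    # Accept tuples as well as lists.
    return isinstance(value, (list, tuple))


def normalize(value):
    """Return value with bytes decoded to str and tuples converted to lists."""
    if isinstance(value, bytes):
        return value.decode("utf-8")
    if is_json_array(value):
        return [normalize(v) for v in value]
    if isinstance(value, dict):
        return {normalize(k): normalize(v) for k, v in value.items()}
    return value
```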
Ben Pfaff [Thu, 22 Sep 2011 18:54:22 +0000 (11:54 -0700)]
netdev-linux: Fix broken build on RHEL 6.
Commit 00fa9d37c2b "Do not include net/ethernet.h and linux/if_tunnel.h"
introduced a compile error on RHEL 6:
lib/netdev-linux.c: In function 'netdev_linux_listen':
lib/netdev-linux.c:734: error: 'ETH_P_ALL' undeclared (first use in this
function)
This fixes the problem.
I verified that the Android NDK r6b mentioned in the previous commit
contains a file named android-ndk-r6b/platforms/android-3/arch-x86/use/
linux/if_ether.h that defines ETH_P_ALL. I didn't try building on that
platform.
Ben Pfaff [Thu, 22 Sep 2011 18:36:39 +0000 (11:36 -0700)]
netlink-socket: Async notifications are incompatible with other operations.
A Netlink socket that receives asynchronous notifications (e.g. from a
multicast group) cannot be used for transactions or dumps, because those
operations would discard asynchronous messages that arrive while waiting
for replies.
This commit documents this issue in a comment on nl_sock_join_mcgroup().
It also removes an internal attempt to avoid mixing multicast reception
with other operations. The attempt was incomplete, because it only
handled dumps even though ordinary transactions are also problematic. It
seems better to remove it than to fix it because, first, all of the
existing users in OVS already separate multicast reception from other
operations and, second, an upcoming commit will start using unicast
Netlink for asynchronous notifications, which has the same issues but
doesn't use nl_sock_join_mcgroup().
Ben Pfaff [Wed, 21 Sep 2011 21:56:55 +0000 (14:56 -0700)]
ovs-xapi-sync: Make pychecker-able.
pychecker imports the code that it checks, which means that code at top
level runs, so "ovs-xapi-sync" failed to import unless the user had write
access to /var/log/openvswitch.
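The message does not say how the script was changed. One common way to make a
script importable, shown here only as an assumption with hypothetical names,
is to move side effects such as opening the log file out of top level and
behind a main() function:

```python
import os
import tempfile


def open_log(log_dir):
    """Open the log file; this needs write access to log_dir."""
    return open(os.path.join(log_dir, "example.log"), "a")


def main(log_dir):
    with open_log(log_dir) as log:
        log.write("started\n")
    return 0


# Importing this module (as pychecker does) opens no files; only running
# it directly reaches main().
if __name__ == "__main__":
    main(tempfile.mkdtemp())
```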
Simon Horman [Thu, 22 Sep 2011 12:24:14 +0000 (21:24 +0900)]
Remove netdev_find_dev_by_in4
netdev_find_dev_by_in4() appears to no longer be used and thus
can be removed. This also allows netdev_enumerate(), the
enumerate member of struct netdev_class and netdev_linux_enumerate()
to be removed.
I noticed this as netdev_linux_enumerate() makes use of if_nameindex()
and if_freenameindex() which are not available when compiling using
the Android NDK r6b (Android API level 13).
datapath: IFF_BRIDGE_PORT is backported by Centos 5.6.
Some versions of Centos 5.6 backport the flag IFF_BRIDGE_PORT
without the associated rx_handler changes, so this commit switches to
a version check, since we really don't care about the actual
symbol.
datapath: Send to userspace errors shouldn't halt processing.
If we encounter an error when sending a packet to userspace due to
an explicit action we stop processing further actions. This makes
sense for things like push vlan, where to continue means outputting
an incorrect packet. However, sending to userspace is more akin
to outputting to a port, which does not halt further processing.
For consistency, ignore errors in this case as well.
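The policy can be modeled roughly as follows. This is a sketch in Python, not
the datapath's C code, and every name in it is hypothetical:

```python
class ActionError(Exception):
    pass


def execute_actions(actions, packet):
    """Run actions in order; return the names of those that completed."""
    done = []
    for name, func, halts_on_error in actions:
        try:
            func(packet)
        except ActionError:
            if halts_on_error:
                break       # e.g. push vlan: later output would be wrong
            continue        # e.g. userspace: treated like a failed output
        done.append(name)
    return done


def fail(packet):
    raise ActionError("queue full")


def ok(packet):
    pass


userspace_fails = [("userspace", fail, False), ("output", ok, False)]
push_vlan_fails = [("push_vlan", fail, True), ("output", ok, False)]
```

With these inputs, a failed "userspace" action still lets "output" run, while
a failed "push_vlan" action stops processing.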
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
datapath: Correctly validate vport attributes on old kernels.
The vport policy entries for OVS_VPORT_ATTR_PORT_NO and OVS_VPORT_ATTR_TYPE
are present only in the section for newer kernels. This means that
on older kernels the lengths of these attributes are never checked
anywhere, but we go ahead and read from them anyway.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
We never allow shared skbs to be present inside of the OVS datapath
but the presence of a check in the core makes this less clear. Since
the check is very old and no longer relevant, drop it.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
datapath: Fully initialize datapath before local port.
It's possible to start receiving packets on a datapath as soon as
the internal device is created. It's therefore important that the
datapath be fully initialized before this, which it currently isn't.
In particular, the fact that dp->stats_percpu is not yet set is
potentially fatal. In addition, if allocation of the Netlink response
failed it would leak the percpu memory. This fixes both problems.
Found by code inspection; in practice the datapath is probably always
done initializing before someone can send a packet on it.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
datapath: Correctly set error code in queue_userspace_packets().
In a few places in queue_userspace_packets() when we encounter an
error, we don't actually set the 'err' variable. Although we
free the packets we don't correctly account for these packets as
being lost.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
datapath-protocol: vport_stats types are unsigned.
The 'u' in uint64_t apparently got clipped off of the tx_dropped
member of struct vport_stats in between review and push, incorrectly
making this a signed type.
The nl_lookup_genl_mcgroup() function can fail on older kernels
which do not support the required netlink interface. Before this
patch, dpif-linux would refuse to create a datapath when this
happened. With this patch, it attempts to use a workaround. If
the workaround fails it simply disables the affected features
without completely disabling the dpif.
Ethan Jackson [Wed, 14 Sep 2011 18:26:21 +0000 (11:26 -0700)]
dpif-linux: Open dpif despite notifier failures.
Before this patch, if dpif-linux failed to register a notifier it
would give up opening the datapath entirely. This seems draconian
as a dpif can still perform the majority of its intended
functionality without vport notifications.
Ethan Jackson [Mon, 12 Sep 2011 21:09:34 +0000 (14:09 -0700)]
datapath: Hardcode vport multicast group ID on older kernels.
Older kernels do not advertise the multicast groups of families
when requested by userspace. As a workaround, this patch hardcodes
the multicast group ID of the ovs_vport family on these kernels.
Userspace will be able to fall back to this hardcoded value if the
standard mechanism is unavailable.
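The userspace side of such a fallback might look like the sketch below. The
function names, the exception, and the constant's value are placeholders
chosen for illustration, not the real Netlink API:

```python
FALLBACK_VPORT_MCGROUP = 33  # placeholder value, for illustration only


class LookupUnsupported(Exception):
    """Raised when the kernel does not advertise multicast groups."""


def lookup_mcgroup(query, family, group, fallback=None):
    """Ask the kernel for a group ID; use fallback if it cannot answer."""
    try:
        return query(family, group)
    except LookupUnsupported:
        if fallback is None:
            raise
        return fallback


def old_kernel(family, group):
    raise LookupUnsupported(family)


def new_kernel(family, group):
    return 7  # arbitrary ID that a newer kernel might report
```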
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Ethan Jackson [Thu, 15 Sep 2011 18:21:23 +0000 (11:21 -0700)]
notifiers: Create and destroy nln_notifiers.
This patch changes the interface of netlink-notifier and
rtnetlink-link. Now nln_notifiers are allocated and destroyed by
the module instead of passed in by callers. This allows the
definition of nln_notifier to be hidden, and generally cleans up
the code.
Ethan Jackson [Thu, 15 Sep 2011 18:23:08 +0000 (11:23 -0700)]
notifiers: Rename run and wait functions.
It makes more sense to call nln_notifier_run() and
nln_notifier_wait() simply nln_run() and nln_wait() since they
don't operate on notifiers but the entire nln object. This patch
changes the nln and the rtnetlink-link modules to the new
convention.
datapath: Always use generic stats for devices (vports)
Currently OVS uses device stats for Linux devices and counts packets
itself in other situations. This leads to overlap with hardware stats,
inconsistencies, etc. It's much better to just always count the packets
flowing through the switch and let userspace do any merging that it wants.
This patch removes the vport->get_stats() interface. vport-stat is changed
to use the new `struct ovs_vport_stat` rather than rtnl_link_stats64.
The definition of rtnl_link_stats64 is removed from OVS. dpif_port->stat is
also removed, as aggregate stats are only available at the netdev layer.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Currently the kernel automatically sets the MTU of any internal
interfaces to the minimum of all attached interfaces because the Linux
bridge does this. Userspace can do this with more knowledge and
flexibility.
Ben Pfaff [Thu, 15 Sep 2011 22:55:45 +0000 (15:55 -0700)]
ovs-brcompatd: Delete ports when netdevs on fake bridges disappear.
Until now, when a network device disappeared, netdev_changed_cb() passed
the name of the bridge that contained the network device to ovs-vsctl as
part of the "del-port" command. However, when the network device was
actually a "fake bridge", it would pass the name of the real bridge, which
ovs-vsctl rejected as wrong (expecting the name of the fake bridge), so it
did not remove the port.
This fixes the problem by dropping the bridge name, which is simpler than
trying to get the name of the fake bridge in this case.
Ben Pfaff [Thu, 25 Aug 2011 17:18:47 +0000 (10:18 -0700)]
rhel: Fix "make distcheck" failure due to regenerating spec files.
We want to regenerate the RPM spec files whenever the version number
changes, hence the dependency on config.status. But that means that we
try to modify the spec files even when the version number doesn't change,
which causes "make distcheck" to fail because it write-protects the source
directory. So this commit instead just "touch"es the spec files when
they don't really change, which still works OK with a write-protected
source directory.
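The actual change is in the build rules. The same idea, sketched in Python
with a hypothetical helper name, is to rewrite a generated file only when its
content differs and otherwise just bump its timestamp:

```python
import os


def update_generated(path, new_text):
    """Rewrite path only if its content changes; otherwise just touch it.

    Returns True if the file was rewritten, False if it was only touched.
    """
    try:
        with open(path) as f:
            old_text = f.read()
    except FileNotFoundError:
        old_text = None
    if old_text == new_text:
        os.utime(path, None)  # like "touch": update the timestamp only
        return False
    with open(path, "w") as f:
        f.write(new_text)
    return True
```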
Ben Pfaff [Wed, 7 Sep 2011 17:12:13 +0000 (10:12 -0700)]
ofp-util: Remove obsolete build assertion.
Commit d1e9b9bf3 "nicira-ext: Renumber NXT_FLOW_MOD_TABLE_ID" eliminated
the need for the NXT_SET_FLOW_FORMAT and NXT_FLOW_MOD_TABLE_ID commands to
have different sizes, so asserting that they are different isn't useful
anymore (although it is still correct and always will be).
Ben Pfaff [Thu, 15 Sep 2011 17:41:15 +0000 (10:41 -0700)]
netdev: Allow get_mtu and set_mtu provider functions to be null.
Most netdev provider functions are allowed to be null if the implementation
does not support the corresponding feature. This commit allows the same for
get_mtu and set_mtu, and changes netdev-vport to take advantage of it.
This commit also changes netdev_get_mtu() to report an MTU of 0 on error,
instead of leaving the MTU indeterminate.
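A rough model of the wrapper's behavior, in Python rather than C. The provider
classes are hypothetical, and returning EOPNOTSUPP for a null function is an
assumption made for this sketch:

```python
import errno


class DummyProvider:
    get_mtu = None  # this provider does not implement MTU queries


class FixedProvider:
    @staticmethod
    def get_mtu():
        return 0, 1500  # (error, mtu)


class BrokenProvider:
    @staticmethod
    def get_mtu():
        return errno.ENODEV, 9999  # mtu is garbage when error is nonzero


def netdev_get_mtu(provider):
    """Return (error, mtu); mtu is 0 whenever error is nonzero."""
    if provider.get_mtu is None:
        return errno.EOPNOTSUPP, 0  # assumed error code for "not supported"
    error, mtu = provider.get_mtu()
    return (error, mtu) if error == 0 else (error, 0)
```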
Jesse Gross [Mon, 1 Aug 2011 07:35:20 +0000 (00:35 -0700)]
datapath: Set vport in skb when executed from userspace.
Currently, the OVS_CB(skb)->vport member is never initialized for
packets coming from userspace. This means that they can never be
sampled by sFlow and generally violates our principle that userspace
packets should be made to look the same as others.
Ben Pfaff [Mon, 12 Sep 2011 23:48:07 +0000 (16:48 -0700)]
ofproto-dpif: Optimize flow revalidation for MAC learning.
Without this commit, every NXAST_LEARN action that adds a flow causes every
facet to be revalidated. With this commit, as long as the "Usage Advice"
in the large comment on struct nx_action_learn in nicira-ext.h is followed,
this no longer happens.
Ben Pfaff [Mon, 12 Sep 2011 23:19:57 +0000 (16:19 -0700)]
Implement new "learn" action.
There are a few loose ends here. First, learning actions cause too much
flow revalidation. Upcoming commits will fix that problem. The following
additional issues have not yet been addressed:
* Resource limits: nothing yet limits the maximum number of flows that
can be learned. It is possible to exhaust all system memory.
* Age reporting: there is no way to find out how soon a learned table
entry is due to be evicted.
To try this action out, here's a recipe for a very simple-minded MAC
learning switch. It uses a 10-second MAC expiration time to make it easier
to see what's going on:
Ben Pfaff [Fri, 19 Aug 2011 17:33:09 +0000 (10:33 -0700)]
ofproto: Reinterpret meaning of OpenFlow hard timeouts with OFPFC_MODIFY.
I finally found a good use for hard timeouts in OpenFlow, but they require
a slight reinterpretation of the meaning of hard timeouts. Until now, a
hard timeout meant that a flow would be removed the specified number of
seconds after a flow was created. Intervening modifications with
OFPFC_MODIFY(_STRICT) had no effect on the hard timeout; the flow would
still be deleted the specified number of seconds after its original
creation.
This commit changes the effect of OFPFC_MODIFY(_STRICT). Now, modifying
a flow resets its hard timeout counter. A flow will time out the specified
number of seconds after creation or after the last time it is modified,
whichever comes later.
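The difference between the two interpretations can be written down as a small
model. The Flow class is hypothetical and times are plain numbers in seconds:

```python
class Flow:
    def __init__(self, created, hard_timeout):
        self.created = created
        self.modified = created  # a new flow counts as modified at creation
        self.hard_timeout = hard_timeout

    def modify(self, now):
        """Record an OFPFC_MODIFY(_STRICT) at time 'now'."""
        self.modified = now

    def expires_old(self):
        # Old meaning: always measured from creation.
        return self.created + self.hard_timeout

    def expires_new(self):
        # New meaning: measured from creation or last modification,
        # whichever comes later.
        return max(self.created, self.modified) + self.hard_timeout


flow = Flow(created=100, hard_timeout=10)
flow.modify(now=105)
```

Here the old interpretation expires the flow at time 110 and the new one at
time 115.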
Ben Pfaff [Thu, 18 Aug 2011 18:20:12 +0000 (11:20 -0700)]
test-openflowd: Allow specifying port type on --ports option.
This allows a command like "test-openflowd --enable-dummy dummy@br0
--ports=dummy@eth0,dummy@eth1,dummy@eth2" to create a dummy datapath with
a number of dummy ports. This is more useful for testing than a dummy
datapath with just an internal port, since output to "flood" and "normal"
has less pathological results.
Ben Pfaff [Mon, 12 Sep 2011 19:11:50 +0000 (12:11 -0700)]
meta-flow: New library for working with fields by id.
OVS already has a fairly good set of functions for working with fields that
are known at compile time, but support for working with fields that are
known only at runtime is fairly limited (and fairly unneeded). However,
with NXM identifiers becoming more and more widely used throughout Nicira
extensions, it's becoming correspondingly more and more common to need to
refer to fields at runtime. This new library represents a first attempt
at a systematic approach for doing so.
Ben Pfaff [Wed, 10 Aug 2011 21:48:33 +0000 (14:48 -0700)]
ofproto: Avoid using list_size() to compute length of 'pending' list.
Currently this only gets checked for incoming OpenFlow OFPT_FLOW_MOD
messages, so it's hard to imagine it being any kind of bottleneck, but the
NXAST_LEARN action that is soon to be added will be able to create flows
more quickly than we normally expect from a controller. (On the other
hand, ofproto-dpif, outside of a special testing mode, always completes
operations immediately, so 'pending' will always have length 0. But this
change still feels right to me for some reason.)
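The trade-off is the usual one between walking a linked list to count its
elements and keeping a counter up to date. A sketch with hypothetical names,
not the ofproto code:

```python
class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


class PendingList:
    def __init__(self):
        self.head = None
        self.n_pending = 0  # maintained on every push and pop: O(1) to read

    def push(self, value):
        self.head = Node(value, self.head)
        self.n_pending += 1

    def pop(self):
        node = self.head
        self.head = node.next
        self.n_pending -= 1
        return node.value

    def walk_size(self):
        """O(n) length, analogous to counting with list_size()."""
        n = 0
        node = self.head
        while node is not None:
            n += 1
            node = node.next
        return n
```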
Ben Pfaff [Tue, 16 Aug 2011 23:08:24 +0000 (16:08 -0700)]
ofp-parse: Refactor action parsing to improve compiler warnings.
When a new action is added, compiler warnings show most of the places that
need new code to handle that action. The action parsing code in
ofp-parse.c was the one remaining missing case. This commit fixes that.
Ben Pfaff [Wed, 17 Aug 2011 18:01:17 +0000 (11:01 -0700)]
ofp-util: Further abstract definitions of action properties.
This commit primarily moves the OFPAT_ACTION and NXAST_ACTION invocations
into a new file ofp-util.def. This allows multiple places in the source to
use them.
This commit also adds a new function ofputil_action_code_from_name().
The following commit will add the first user.
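The benefit of a single shared definition can be sketched in Python, with one
table from which several lookups are derived. The names and codes below are
placeholders, not the contents of ofp-util.def:

```python
# One shared list of action names; everything else is derived from it.
ACTION_NAMES = ("output", "strip_vlan", "resubmit")

# Placeholder codes: simply the position of each name in the list.
NAME_TO_CODE = {name: code for code, name in enumerate(ACTION_NAMES)}
CODE_TO_NAME = dict(enumerate(ACTION_NAMES))


def action_code_from_name(name):
    """Return the code for name, or -1 if the name is unknown."""
    return NAME_TO_CODE.get(name, -1)
```

Adding an action to ACTION_NAMES updates both lookups at once, which is the
point of keeping the definitions in one place.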
Ben Pfaff [Mon, 12 Sep 2011 17:57:28 +0000 (10:57 -0700)]
classifier: Change cls_rule_set_nd_target() to take a pointer.
The other cls_rule_*() functions that take IPv6 addresses take a pointer
to an in6_addr, so cls_rule_set_nd_target() should as well for consistency.
Possibly this is more efficient also, although I guess it doesn't really
make much of a difference either way.
Ethan Jackson [Mon, 12 Sep 2011 23:56:21 +0000 (16:56 -0700)]
ofproto-dpif: Emit set_tunnel when required to.
ofproto-dpif assumed that the datapath initialized the tun_id of a
flow on egress to its tun_id on ingress. For this reason, if
OpenFlow actions set the tun_id to a flow's ingress tun_id,
ofproto-dpif would fail to emit a set_tunnel action.
Reported-by: Igor Ganichev <iganichev@nicira.com>
Reported-by: Pankaj Thakkar <thakkar@nicira.com>
datapath: Strip down vport interface : OVS_VPORT_ATTR_MTU
There is no need for the vport MTU attribute (OVS_VPORT_ATTR_MTU), because
the Linux net-device ioctls can be used to get and set the MTU of a Linux
device. This patch removes OVS_VPORT_ATTR_MTU from the datapath protocol.
This patch also adds a netdev_set_mtu interface so that MTU adjustments
can be done from OVS userspace. The get_mtu() interface is also changed: it
now returns EOPNOTSUPP, rather than returning 0 and setting *pmtu
to INT_MAX, when there is no MTU attribute for the given device.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>