netdev_dpdk_vhost_destruct() calls rte_vhost_driver_unregister(), which
can trigger the destroy_device() callback. destroy_device() will try to
take two mutexes already held by netdev_dpdk_vhost_destruct(), causing a
deadlock.
This problem can be solved by dropping the mutexes before calling
rte_vhost_driver_unregister(). The netdev_dpdk_vhost_destruct() and
construct() calls are already serialized by netdev_mutex.
This commit also makes clear that dev->vhost_id is constant and can be
accessed without taking any mutexes in the lifetime of the devices.
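The locking order described above can be sketched as follows. This is a minimal illustration, not the netdev-dpdk code: dev_mutex, destroy_device_cb(), driver_unregister() and vhost_destruct() are hypothetical stand-ins for the real mutexes, the destroy_device() callback, rte_vhost_driver_unregister() and netdev_dpdk_vhost_destruct().

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the mutexes held by the destruct path. */
static pthread_mutex_t dev_mutex = PTHREAD_MUTEX_INITIALIZER;
static bool device_destroyed = false;

/* Plays the role of destroy_device(): it needs dev_mutex. */
static void destroy_device_cb(void)
{
    pthread_mutex_lock(&dev_mutex);
    device_destroyed = true;
    pthread_mutex_unlock(&dev_mutex);
}

/* Plays the role of rte_vhost_driver_unregister(): it may run the callback. */
static void driver_unregister(const char *vhost_id)
{
    (void) vhost_id;
    destroy_device_cb();
}

/* Sketch of the destruct path after the fix. */
static void vhost_destruct(const char *vhost_id)
{
    pthread_mutex_lock(&dev_mutex);
    /* ... tear down state that really needs the mutex ... */
    pthread_mutex_unlock(&dev_mutex);

    /* The mutex is dropped before unregistering, so the callback can take
     * it.  vhost_id is assumed constant, so reading it needs no lock. */
    driver_unregister(vhost_id);
}
```

Calling driver_unregister() while still holding dev_mutex would make destroy_device_cb() block forever on a default (non-recursive) mutex, which is the deadlock the commit removes.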
ofproto: Consider datapath_type when looking for internal ports.
Interfaces with type "internal" end up having a netdev with type "tap"
in the dpif-netdev datapath, so a strcmp will fail to match internal
interfaces.
We can translate the types with ofproto_port_open_type() before calling
strcmp to fix this.
This fixes a minor issue where internal interfaces are considered
non-internal in the userspace datapath for the purpose of adjusting the
MTU.
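A minimal sketch of the comparison, assuming a simplified translation function. port_open_type() below is a hypothetical stand-in for ofproto_port_open_type() and only models the "internal" to "tap" mapping mentioned above; the datapath type name "netdev" for the userspace datapath is part of that assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for ofproto_port_open_type(): in the userspace
 * ("netdev") datapath an "internal" port is opened as a "tap" netdev. */
static const char *port_open_type(const char *datapath_type,
                                  const char *port_type)
{
    if (!strcmp(datapath_type, "netdev") && !strcmp(port_type, "internal")) {
        return "tap";
    }
    return port_type;
}

/* Translate "internal" for this datapath first, then compare. */
static bool netdev_is_internal(const char *datapath_type,
                               const char *netdev_type)
{
    return !strcmp(netdev_type, port_open_type(datapath_type, "internal"));
}
```

A plain strcmp(netdev_type, "internal") would return nonzero for the "tap" netdev in the userspace datapath, which is the mismatch described above.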
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Kyle Mestery [Mon, 8 Aug 2016 13:48:40 +0000 (06:48 -0700)]
ovs-vsctl: Change log level of vsctl_parent_process_info
While running the ovn-scale-test [1] port-binding tests [2], I notice a
continual stream of messages such as this:
2016-08-04 13:05:28.705 547 INFO rally_ovs.plugins.ovs.scenarios.ovn [-] bind lport_0996bf_cikzNO to sandbox-172.16.200.24 on ovn-farm-node-uat-dal09-compute-325
2016-08-04 13:05:28.712 547 INFO paramiko.transport [-] Connected (version 2.0, client OpenSSH_6.6.1p1)
2016-08-04 13:05:28.805 547 INFO paramiko.transport [-] Authentication (publickey) successful!
2016-08-04T13:05:28Z|00002|vsctl|WARN|/proc/0/cmdline: open failed (No such file or directory)
2016-08-04T13:05:29Z|00002|vsctl|WARN|/proc/0/cmdline: open failed (No such file or directory)
2016-08-04 13:05:29.042 547 INFO rally_ovs.plugins.ovs.scenarios.ovn [-] bind lport_0996bf_tvovcK to sandbox-172.16.200.24 on ovn-farm-node-uat-dal09-compute-325
2016-08-04T13:05:29Z|00002|vsctl|WARN|/proc/0/cmdline: open failed (No such file or directory)
2016-08-04T13:05:29Z|00002|vsctl|WARN|/proc/0/cmdline: open failed (No such file or directory)
2016-08-04 13:05:29.285 547 INFO rally_ovs.plugins.ovs.scenarios.ovn [-] bind lport_0996bf_HwG7AK to sandbox-172.16.200.24 on ovn-farm-node-uat-dal09-compute-325
2016-08-04T13:05:29Z|00002|vsctl|WARN|/proc/0/cmdline: open failed (No such file or directory)
2016-08-04T13:05:29Z|00002|vsctl|WARN|/proc/0/cmdline: open failed (No such file or directory)
2016-08-04 13:05:29.505 547 INFO rally_ovs.plugins.ovs.scenarios.ovn [-] bind lport_0996bf_Lqbv92 to sandbox-172.16.200.24 on ovn-farm-node-uat-dal09-compute-325
2016-08-04T13:05:29Z|00002|vsctl|WARN|/proc/0/cmdline: open failed (No such file or directory)
2016-08-04T13:05:29Z|00002|vsctl|WARN|/proc/0/cmdline: open failed (No such file or directory)
2016-08-04 13:05:29.724 547 INFO rally_ovs.plugins.ovs.scenarios.ovn [-] bind lport_0996bf_6f8uQW to sandbox-172.16.200.24 on ovn-farm-node-uat-dal09-compute-325
2016-08-04T13:05:29Z|00002|vsctl|WARN|/proc/0/cmdline: open failed (No such file or directory)
2016-08-04T13:05:29Z|00002|vsctl|WARN|/proc/0/cmdline: open failed (No such file or directory)
2016-08-04 13:05:29.944 547 INFO rally_ovs.plugins.ovs.scenarios.ovn [-] bind lport_0996bf_nKl2XF to sandbox-172.16.200.24 on ovn-farm-node-uat-dal09-compute-325
Tracing these down, this is due to the check in vsctl_parent_process_info(),
which is verifying if the parent process can be opened. Since ovn-scale-test
runs sandboxes in containers, and these are run as root, there is no /proc/0
in the container. Thus, the check fails, and the error message is printed out.
It's unclear what value this log message provides, so removing it clears up
this problem and is probably the best option.
For the init process with pid of zero, this patch returns "init",
instead of trying to read from /proc/0/cmdline, which does not exist.
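The special case for pid zero might look like the sketch below. parent_process_name() is a hypothetical helper, not the actual vsctl_parent_process_info() code; it only shows returning "init" without touching /proc/0/cmdline.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: stores the name of process 'pid' in 'buf'.
 * Returns 0 on success, -1 if the name cannot be read. */
static int parent_process_name(pid_t pid, char *buf, size_t size)
{
    char path[64];
    FILE *f;
    size_t n;

    if (size == 0) {
        return -1;
    }
    if (pid == 0) {
        /* No /proc/0 exists, so do not even try to open it. */
        snprintf(buf, size, "init");
        return 0;
    }

    snprintf(path, sizeof path, "/proc/%ld/cmdline", (long) pid);
    f = fopen(path, "r");
    if (!f) {
        buf[0] = '\0';
        return -1;
    }
    n = fread(buf, 1, size - 1, f);
    fclose(f);
    buf[n] = '\0';
    return 0;
}
```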
Ben Pfaff [Sat, 6 Aug 2016 06:47:59 +0000 (23:47 -0700)]
lflow: Correct register definitions to use subfields for overlaps.
OVN expressions need to know what fields overlap or alias one another.
This is supposed to be done via subfields: if two fields overlap, then the
smaller one should be defined as a subfield of the larger one. For
example, reg0 should be defined as xxreg0[96..127]. The symbol table in
lflow didn't do this, so it's possible for confusion to result. (I don't
have evidence of this actually happening, because it would only occur
in a case where the same bits of a field were referred to with different
names.)
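To make the overlap concrete, here is a small sketch of reading a 32-bit subfield out of a 128-bit register. The struct u128 layout and subfield32() are assumptions for illustration, not the OVN symbol table code; bit 0 is taken to be the least significant bit.

```c
#include <assert.h>
#include <stdint.h>

/* A 128-bit register as two 64-bit halves (assumed layout): bits 0..63
 * live in 'lo' and bits 64..127 in 'hi'. */
struct u128 {
    uint64_t hi;
    uint64_t lo;
};

/* Returns the 32-bit subfield x[ofs..ofs+31].  'ofs' must be 0, 32, 64
 * or 96, so the subfield never crosses the boundary between halves. */
static uint32_t subfield32(const struct u128 *x, unsigned int ofs)
{
    uint64_t half = ofs >= 64 ? x->hi : x->lo;
    return (uint32_t) (half >> (ofs % 64));
}
```

With this layout, a write to xxreg0 changes what subfield32(&xxreg0, 96) returns, which is why reg0 has to be declared as xxreg0[96..127] rather than as an independent field.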
This commit fixes the problem. It deserves a test, but that's somewhat
difficult at this point, so it will actually happen in a future commit.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Ryan Moats <rmoats@us.ibm.com>
Ben Pfaff [Fri, 15 Jul 2016 22:31:42 +0000 (15:31 -0700)]
expr: Give a subfield a direct pointer to its parent in struct expr_symbol.
Until now, symbols that represent subfields and predicates were both
implemented as the same string member, named 'expansion', inside struct
expr. This makes it a little inconvenient to find the parent of a subfield
for two reasons. First, one must actually parse the string, e.g. to
convert "vlan.tci[13..15]" into a pointer to a struct. Second, and more
importantly, to parse the string it's necessary to have access to the
symbol table, which isn't always convenient to pass around. This commit
avoids the problem by breaking apart subfields and predicates and giving
the former a direct pointer to the parent symbol.
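A simplified sketch of the idea, assuming a cut-down symbol structure. struct symbol and symbol_root() are hypothetical and much smaller than the real struct expr_symbol; they only show that a parent pointer removes both the string parsing and the symbol table lookup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified symbol; not the real struct expr_symbol. */
struct symbol {
    const char *name;
    const struct symbol *parent;  /* NULL unless this is a subfield. */
    int parent_ofs;               /* First bit within 'parent'. */
    int n_bits;                   /* Width of this symbol in bits. */
};

/* Follows parent pointers to the root field and accumulates the bit
 * offset.  No symbol table and no string parsing are needed. */
static const struct symbol *symbol_root(const struct symbol *s, int *ofs)
{
    *ofs = 0;
    while (s->parent) {
        *ofs += s->parent_ofs;
        s = s->parent;
    }
    return s;
}
```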
We could do the same thing for predicates by storing a pointer to a
pre-built struct expr, but so far it's not necessary.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Ryan Moats <rmoats@us.ibm.com>
Ben Pfaff [Fri, 15 Jul 2016 21:27:55 +0000 (14:27 -0700)]
expr: Track writability as part of expr_symbol.
Until now it was only possible to find out whether an expr_symbol was
read/write or read-only, for subfields, by chasing down whether the
eventual parent field was read/write or read-only. This commit adds
a new 'rw' member that indicates directly.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Ryan Moats <rmoats@us.ibm.com>
Ben Pfaff [Wed, 3 Aug 2016 05:46:18 +0000 (22:46 -0700)]
expr: Initialize 'relop' of allocated exprs in crush_and_string().
Every relop at this point is always EXPR_R_EQ, and therefore it seems that
no code actually examined it, so this doesn't appear to fix an existing
bug, but some code I was working on was affected by the uninitialized
member.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Ryan Moats <rmoats@us.ibm.com>
Ben Pfaff [Wed, 3 Aug 2016 04:53:59 +0000 (21:53 -0700)]
expr: Refine handling of error parameter to expr_annotate().
In most cases expr_annotate() set '*errorp' to NULL if it was successful,
but there was one case where it did not. This corrects that and refines
the comment to better explain the intended behavior.
This didn't affect any existing users because all of them passed in a
pointer that was already NULL.
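The convention can be illustrated with a small function that is unrelated to expr_annotate() itself. check_positive() is hypothetical; the point is only that every successful path stores NULL in '*errorp', so callers need not pre-initialize it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical function following the convention: on success, stores NULL
 * in '*errorp' and returns true; on failure, stores a malloc()'d message
 * that the caller must free and returns false. */
static bool check_positive(int value, char **errorp)
{
    if (value <= 0) {
        const char *msg = "value must be positive";

        *errorp = malloc(strlen(msg) + 1);
        if (*errorp) {
            strcpy(*errorp, msg);
        }
        return false;
    }

    *errorp = NULL;  /* Set on every successful path. */
    return true;
}
```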
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Ryan Moats <rmoats@us.ibm.com>
Ben Pfaff [Sun, 31 Jul 2016 17:18:40 +0000 (10:18 -0700)]
expr: Fine-tune parser error message for common typo.
It's easy to type "=" in place of "==" in an expression but the expression
parser's error message was far from clear. For multibit numeric fields,
it said:
Explicit `!= 0' is required for inequality test of multibit field
against 0.
For string fields, the parser treated such an expression as "<name> != 0"
and thus it said:
String field <name> is not compatible with numeric constant.
This improves the error message in each case to:
Syntax error at `=' expecting relational operator.
which I hope is clearer.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Ryan Moats <rmoats@us.ibm.com>
Ben Pfaff [Fri, 15 Jul 2016 21:13:02 +0000 (14:13 -0700)]
ofp-actions: Correct member name for write_actions.
For a variable-length action like write_actions, the member name is
supposed to be the name of the variable-length array at the end of the
action structure. It only makes a real difference if the beginning of the
array is not 64-bit aligned, so it did not matter in this case, but it's
better to get it right.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Ryan Moats <rmoats@us.ibm.com>
Ben Pfaff [Wed, 27 Jul 2016 06:55:25 +0000 (23:55 -0700)]
ovsdb-idl: Wake up ovsdb_idl_loop when a transaction commits.
There is a fair amount of code that defers modifying the database when a
transaction cannot be created (because there is already one outstanding).
This code tends to assume that the main loop will wake up again when it
becomes possible again to modify the database, but the actual ovsdb_idl_loop
implementation only did this if the database had changed. This is too
conservative a policy and may account for some failures I've seen in tests.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Ryan Moats <rmoats@us.ibm.com>
Ben Pfaff [Mon, 8 Aug 2016 03:44:51 +0000 (20:44 -0700)]
ovn-nbctl: Add "sync" command to wait for previous changes to take effect.
It's slow to add --wait to every ovn-nbctl command; only the last command
needs it. But it's sometimes inconvenient to add it to the last command
if it's in a loop, etc. This makes it possible to separately wait for
the OVN southbound or hypervisors to catch up to the northbound.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Ryan Moats <rmoats@us.ibm.com>
The '-d' flag tells autotest to always keep the testcase output, but
prevents '--recheck' from working. If a user wants to always keep the
output from the tests, the '-d' flag can be passed explicitly. This is
more in line with the other test make targets ('check',
'check-system-userspace').
CC: Andy Zhou <azhou@ovn.org> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Andy Zhou <azhou@ovn.org>
system-traffic: Flush conntrack after debug ping6.
We want to discard any state created by the initial ping6 (used to wait
for an available IP address). Otherwise some weird state can show up in
the connection tracking tables (such as ICMP connection from link-local
addresses).
Fixes: e5cf8cce2759("system-tests: Add ping through conntrack test.") Reported-by: Joe Stringer <joe@ovn.org> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Joe Stringer <joe@ovn.org>
system-userspace-macros: Check the exit code of ethtool.
If the ethtool command is not available on the system we should fail,
since the userspace testsuite cannot work properly without disabling
offloads.
Also, add ethtool to the list of installed packages on Vagrantfile, to
ensure that offloads don't cause test failures in the vagrant VM when
the kernel is updated.
Fixes: ddcf96d2dcc1 ("system-tests: Disable offloads in userspace tests.") Reported-by: Joe Stringer <joe@ovn.org> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Joe Stringer <joe@ovn.org>
This patch adds some comments to the dpcls_lookup() function,
which is one of the most important places where the Userspace
wildcard matching happens.
The purpose is to give some more explanations on its design
and also on how it works.
Joe Stringer [Fri, 5 Aug 2016 00:40:43 +0000 (17:40 -0700)]
system-traffic: Make ping6 vlan test more reliable.
Previously we checked on the underlying interfaces rather than the vlan
interfaces to verify whether IPv6 connectivity is available;
occasionally this would fail on some systems. Wait on the VLAN IP
instead.
Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: <diproiettod@vmware.com>
Maxime Coquelin [Tue, 2 Aug 2016 13:48:27 +0000 (15:48 +0200)]
bridge: No QoS configured is not an error
If no QoS is configured, type value is likely to be an empty
string.
This is not an error though, so use the regular command reply
function, not the error one.
For example, before this patch:
# ovs-appctl -t ovs-vswitchd qos/show vhost-user1
QoS not configured on vhost-user1
ovs-appctl: ovs-vswitchd: server returned an error
After the patch:
# ovs-appctl -t ovs-vswitchd qos/show vhost-user1
QoS not configured on vhost-user1
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Ian Stokes <ian.stokes@intel.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
#0 0x00007efcbadf18d7 in raise () from /lib64/libc.so.6
#1 0x00007efcbadf353a in abort () from /lib64/libc.so.6
#2 0x000000000068d5be in ovs_abort_valist at lib/util.c:335
#3 0x0000000000693d90 in vlog_abort_valist at lib/vlog.c:1204
#4 0x0000000000693e17 in vlog_abort at lib/vlog.c:1218
#5 0x000000000068d3ae in ovs_assert_failure at lib/util.c:72
#6 0x000000000060425c in ds_put_format_valist at lib/dynamic-string.c:168
#7 0x00000000006042e7 in ds_put_format at lib/dynamic-string.c:142
#8 0x00000000005a9e75 in qos_unixctl_show at vswitchd/bridge.c:3185
#9 0x000000000068cda1 in process_command at lib/unixctl.c:347
#11 unixctl_server_run at lib/unixctl.c:400
#12 0x000000000040a3ff in main at vswitchd/ovs-vswitchd.c:113
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Ian Stokes <ian.stokes@intel.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Ciara Loftus [Thu, 4 Aug 2016 10:44:40 +0000 (11:44 +0100)]
netdev-dpdk: Make libnuma dependencies optional
Prior to this patch, OVS with DPDK required the libnuma packages to
build. This patch removes this dependency, making it only a requirement
when the CONFIG_RTE_LIBRTE_VHOST_NUMA option is detected as enabled in
the DPDK build.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Mark Kavanagh [Thu, 4 Aug 2016 09:49:12 +0000 (10:49 +0100)]
netdev-dpdk: fix memory leak
DPDK v16.07 introduces the ability to free memzones.
Up until this point, DPDK memory pools created in OVS could
not be destroyed, thus incurring a memory leak.
Leverage the DPDK v16.07 rte_mempool API to free DPDK
mempools when their associated reference count reaches 0 (this
indicates that the memory pool is no longer in use).
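The reference-counting pattern can be sketched in plain C as follows. struct pool and its helpers are hypothetical and use malloc()/free() in place of the DPDK mempool calls, whose exact use in OVS is not shown in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical pool with a reference count; malloc() stands in for the
 * DPDK mempool allocation. */
struct pool {
    int refcount;
    void *mem;
};

/* Creates a pool with one reference, or returns NULL on failure. */
static struct pool *pool_create(size_t size)
{
    struct pool *p = malloc(sizeof *p);

    if (!p) {
        return NULL;
    }
    p->mem = malloc(size);
    if (!p->mem) {
        free(p);
        return NULL;
    }
    p->refcount = 1;
    return p;
}

static void pool_ref(struct pool *p)
{
    p->refcount++;
}

/* Drops one reference.  Frees the pool and returns true when the count
 * reaches 0, which means no user is left. */
static bool pool_unref(struct pool *p)
{
    if (--p->refcount > 0) {
        return false;
    }
    free(p->mem);
    free(p);
    return true;
}
```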
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
ovs_ct_find_existing() issues a warning if an existing conntrack entry
classified as IP_CT_NEW is found, with the premise that this should
not happen. However, a newly confirmed, non-expected conntrack entry
remains IP_CT_NEW as long as no reply direction traffic is seen. This
has resulted in somewhat confusing kernel log messages. This patch
removes this check and warning.
Fixes: 289f2253 ("openvswitch: Find existing conntrack entry after upcall.") Suggested-by: Joe Stringer <joe@ovn.org> Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org> Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
Ciara Loftus [Wed, 3 Aug 2016 12:29:24 +0000 (13:29 +0100)]
netdev-dpdk: Add support for DPDK 16.07
This commit introduces support for DPDK 16.07 and consequently breaks
compatibility with DPDK 16.04.
DPDK 16.07 introduces some changes to various APIs. These have been
updated in OVS, including:
* xstats API: changes to structure of xstats
* vhost API: replace virtio-net references with 'vid'
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Robert Wojciechowicz <robertx.wojciechowicz@intel.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
With RCU in Open vSwitch it's very easy to protect objects accessed by
a pointer, but sometimes a pointer is not available.
One example is the vhost id for DPDK 16.07. Until DPDK 16.04 a pointer
was used to access a vhost device with RCU semantics. From DPDK 16.07
an integer id (which is an array index) is used to access a vhost
device. Ideally, we want the exact same RCU semantics that we had for
the pointer, on the integer (atomicity, memory barriers, behaviour
around quiescent states)
This commit implements a new type in ovs-rcu: ovsrcu_index. The newly
implemented ovsrcu_index_*() functions should be used to access the
type.
Even though we say "Do not, in general, declare a typedef for a struct,
union, or enum.", I think we're not in the "general" case.
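A rough sketch of such an index type using C11 atomics. The rcu_index type and rcu_index_*() functions are hypothetical stand-ins, not the ovsrcu_index implementation, and the memory orders are an assumption; the behaviour around quiescent states is not modelled here.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for an RCU-protected integer index. */
typedef struct {
    atomic_int v;
} rcu_index;

static void rcu_index_init(rcu_index *i, int value)
{
    atomic_init(&i->v, value);
}

/* Writer side: release ordering publishes everything written before the
 * index is updated. */
static void rcu_index_set(rcu_index *i, int value)
{
    atomic_store_explicit(&i->v, value, memory_order_release);
}

/* Reader side: acquire ordering pairs with the release store above. */
static int rcu_index_get(rcu_index *i)
{
    return atomic_load_explicit(&i->v, memory_order_acquire);
}
```

Wrapping the integer in a struct makes accidental direct access a compile error, which is one reason a dedicated type can be preferable to a bare int here.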
Pravin B Shelar [Wed, 3 Aug 2016 21:37:44 +0000 (14:37 -0700)]
datapath: compat: Use checksum offload for outer header.
The following patch simplifies the UDP checksum routine by unconditionally
using checksum offload for non-GSO packets. We might get some
performance improvement due to code simplification.
Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
For ipv6+udp+geneve encapsulation data, the max_mtu should subtract
sizeof(ipv6hdr), instead of sizeof(iphdr).
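The arithmetic can be sketched as below. geneve_max_mtu() and the set of terms it subtracts are assumptions for illustration, not the kernel function; only the header sizes (20 bytes for IPv4, 40 for IPv6, 8 for UDP, 8 for the Geneve base header) are standard values.

```c
#include <assert.h>
#include <stdbool.h>

enum {
    MAX_IP_PACKET = 65535,  /* Largest IP packet size. */
    IPV4_HDR_LEN = 20,      /* IPv4 header without options. */
    IPV6_HDR_LEN = 40,      /* Fixed IPv6 header. */
    UDP_HDR_LEN = 8,
    GENEVE_BASE_LEN = 8     /* Geneve header without options. */
};

/* Hypothetical helper: largest inner MTU once all encapsulation overhead
 * is subtracted.  The outer IP header size depends on the address family. */
static int geneve_max_mtu(bool ipv6, int link_hdr_len, int options_len)
{
    int ip_hdr_len = ipv6 ? IPV6_HDR_LEN : IPV4_HDR_LEN;

    return MAX_IP_PACKET - UDP_HDR_LEN - GENEVE_BASE_LEN
           - link_hdr_len - ip_hdr_len - options_len;
}
```

Using the IPv4 header size for an IPv6 tunnel would overstate the maximum MTU by 20 bytes.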
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
Only the first and last netlink message for a particular conntrack are
actually sent. The first message is sent through nf_conntrack_confirm when
the conntrack is committed. The last one is sent when the conntrack is
destroyed on timeout. The other conntrack state change messages are not
advertised.
When the conntrack subsystem is used from netfilter, nf_conntrack_confirm
is called for each packet, from the postrouting hook, which in turn calls
nf_ct_deliver_cached_events to send the state change netlink messages.
This commit fixes the problem by calling nf_ct_deliver_cached_events in the
non-commit case as well.
Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action") CC: Joe Stringer <joestringer@nicira.com> CC: Justin Pettit <jpettit@nicira.com> CC: Andy Zhou <azhou@nicira.com> CC: Thomas Graf <tgraf@suug.ch> Signed-off-by: Samuel Gauthier <samuel.gauthier@6wind.com> Acked-by: Joe Stringer <joe@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
Pravin B Shelar [Tue, 2 Aug 2016 06:47:12 +0000 (23:47 -0700)]
datapath: compat: Use udp-checksum function for compat case.
udp_set_csum() has a bug fix that is not relevant for upstream
(commit c77d947191b0).
So OVS needs to use the compat function. This function is also
used from the UDP xmit path, so we have to check USE_UPSTREAM_TUNNEL.
The following patch couples this function to the USE_UPSTREAM_TUNNEL symbol
rather than to the kernel version.
This does not fix a bug; the patch helps code readability.
Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
On some systems I get a sparse warning when compiling
tests/test-netlink-conntrack.c
/usr/include/x86_64-linux-gnu/sys/cdefs.h:307:10: warning: preprocessor
token __always_inline redefined
/usr/include/linux/stddef.h:4:9: this was the original definition
The problem seems to be that Linux upstream commit 283d75737837("uapi/linux/stddef.h: Provide __always_inline to userspace
headers") introduced __always_inline in stddef.h, but glibc headers
didn't like that until e0835a5354ab("Bug 20215: Always undefine
__always_inline before defining it.").
This commit works around the issue by including a glibc header before a
kernel header.
Fixes: 2c06d9a927c5("ovstest: Add test-netlink-conntrack command.") Reported-by: Joe Stringer <joe@ovn.org> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Joe Stringer <joe@ovn.org>
Pravin B Shelar [Tue, 2 Aug 2016 03:12:06 +0000 (20:12 -0700)]
datapath: compat: Detect GSO support at ovs configure
OVS turns on tunnel GSO statically for kernels older than 3.18.
Some distribution kernels could backport tunnel GSO. To make use
of device offload on such kernels, detect the support at the configure
stage.
Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
Paul Boca [Tue, 2 Aug 2016 17:45:42 +0000 (17:45 +0000)]
python tests: Skip TCP6 idl tests on Windows
The IPPROTO_IPV6 is not defined on Python for Windows because of
compatibility with older Windows versions.
This issue is discussed here: https://bugs.python.org/issue6926
Paul Boca [Tue, 2 Aug 2016 17:45:41 +0000 (17:45 +0000)]
python tests: Added fcntl module for Windows
This is needed for lockf function used to lock the PID file on Windows.
ioctl and fcntl functions are not implemented at this time because they are
not used by any script.
Paul Boca [Tue, 2 Aug 2016 17:45:39 +0000 (17:45 +0000)]
python tests: Implemented signal.alarm for Windows
signal.alarm is not available in Windows and would trigger an exception
when called. Implemented this to maintain compatibility between
Windows and Linux for python tests.
Alin Serdean [Tue, 2 Aug 2016 18:19:34 +0000 (18:19 +0000)]
Windows: Local named pipe implementation
Currently, for the command line arguments punix/unix on Windows,
we create a file and write to it the TCP port number to connect to. This is
a security concern.
This patch adds support for the command line arguments punix/unix trying
to mimic AF_UNIX behind a local named pipe.
This patch drops the TCP socket implementation behind command line
arguments punix/unix and switches to the local named pipe implementation.
Since we do not write anything to the file created by the punix/unix
arguments, switch tests to plain file existence.
Man pages and code comments have been updated.
Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Acked-by: Paul Boca <pboca@cloudbasesolutions.com> Signed-off-by: Gurucharan Shetty <guru@ovn.org>
William Tu [Wed, 3 Aug 2016 06:07:15 +0000 (23:07 -0700)]
fedora.spec: Add OVN include files.
Currently 'make rpm-fedora' fails because files exist in the $RPM_BUILD_ROOT
directory but are not listed in the %files section, resulting in the errors below:
RPM build errors:
Installed (but unpackaged) file(s) found:
/usr/include/ovn/actions.h
/usr/include/ovn/expr.h
/usr/include/ovn/lex.h
This patch fixes the problem. It was tested with rpmbuild 4.13.0 under Fedora 23.
Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Russell Bryant <russell@ovn.org>
The conntrack unit tests seem to generate different megaflow masks on
Windows. The megaflow masks depend on the internal ordering of the
subtables, which are sorted using qsort(), based on their max priority.
If two subtables have the same priority the ordering between them
depends on the stability properties of qsort(), which apparently are
different between Windows and Linux/*BSD.
This commit uses multiple OpenFlow tables to build our conntrack
pipelines in the tests, which gives us more control over the visited
subtables and also improves clarity.
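The ordering dependence described above can be shown with a small sketch. This is not the approach the commit takes (it restructures the tests instead), and struct subtable here is a hypothetical two-field stand-in; the comparator breaks ties explicitly so the result no longer depends on how a given qsort() handles equal keys.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified subtable. */
struct subtable {
    int max_priority;
    int id;
};

/* Sorts by descending priority.  Ties are broken by ascending id, so the
 * result is the same on every platform regardless of qsort() stability. */
static int compare_subtables(const void *a_, const void *b_)
{
    const struct subtable *a = a_;
    const struct subtable *b = b_;

    if (a->max_priority != b->max_priority) {
        return a->max_priority > b->max_priority ? -1 : 1;
    }
    return (a->id > b->id) - (a->id < b->id);
}

static void sort_subtables(struct subtable *tables, size_t n)
{
    qsort(tables, n, sizeof *tables, compare_subtables);
}
```

A comparator that looked only at max_priority would leave the relative order of the two priority-10 entries up to the C library.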
Reported-by: Alin Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Joe Stringer <joe@ovn.org>
Improve the tutorial of the basic OVN features. Update the contents of
the "Locally attached networks" and "Locally attached networks with VLANs"
in detail. Logical ports of type "l2gateway" are described.
Submitted-at: https://github.com/openvswitch/ovs/pull/144 Signed-off-by: nickcooper-zhangtonghao <nickcooper-zhangtonghao@opencloud.tech> Signed-off-by: Russell Bryant <russell@ovn.org>
ovn-controller: if 'ovn-bridge-mappings' unconfigured, return directly.
If the chassis doesn't configure the 'external-ids:ovn-bridge-mappings' in
the OVSDB, the 'add_bridge_mappings' should return directly to skip some
unnecessary code.
Signed-off-by: nickcooper-zhangtonghao <nickcooper-zhangtonghao@opencloud.tech> Signed-off-by: Russell Bryant <russell@ovn.org>
Mario Cabrera [Tue, 28 Jun 2016 21:14:53 +0000 (15:14 -0600)]
ovsdb: Fix OVSDB disconnect replication bug
Currently disconnecting from the replicator server means closing the
jsonrpc connection and destroying the monitored table names and
blacklisted table names.
This patch makes a distinction between disconnecting from the
remote server, applicable when replication encounters an error,
and destroying the remote server info, applicable when ovsdb-server
exits gracefully.
Signed-off-by: Mario Cabrera <mario.cabrera@hpe.com>
Joe Stringer [Mon, 1 Aug 2016 20:58:38 +0000 (13:58 -0700)]
compat: Properly handle fragment lru.
In kernels <=3.16 there is an LRU for managing fragment queues for IPv4
and IPv6. Because the backport code comes from more recent upstream
versions of Linux, this LRU management was missing from ip_frag_queue()
and nf_ct_frag6_queue().
Fixes: 595e069a0634 ("compat: Backport IPv4 reassembly.") Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Pravin B Shelar <pshelar@ovn.org>
Joe Stringer [Tue, 12 Jul 2016 22:26:23 +0000 (15:26 -0700)]
compat: Only call nf_defrag_ipv[46]_enable() once.
This function is just a dummy to ensure that the corresponding netfilter
fragment module is loaded, to initialize the shared structures. But it
doesn't need to be invoked once per namespace; one call per protocol
should do the trick.
Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Pravin B Shelar <pshelar@ovn.org>
Joe Stringer [Tue, 12 Jul 2016 22:26:19 +0000 (15:26 -0700)]
compat: Simplify inet_fragment backports.
The core fragmentation handling logic is exported on all supported
kernels, so it's not necessary to backport the latest version of this.
This greatly simplifies the code due to inconsistencies between the old
per-lookup garbage collection and the newer workqueue based garbage
collection.
As a result of simplifying and removing unnecessary backport code, a few
bugs are fixed for corner cases such as when some fragments remain in
the fragment cache when openvswitch is unloaded.
Some backported ip functions need a little more logic than what is seen
on the latest code due to this, for instance on kernels <3.17:
* Call inet_frag_evictor() before defrag
* Limit hashsize in ip{,6}_fragment logic
The pernet init/exit logic also differs a little from upstream. Upstream
ipv[46]_defrag logic initializes the various pernet fragment parameters
and its own global fragments cache. In the OVS backport, the pernet
parameters are shared while the fragments cache is separate. The
backport relies upon upstream pernet initialization to perform the
shared setup, and performs no pernet initialization of its own. When it
comes to pernet exit however, the backport must ensure that all
OVS-specific fragment state is cleared, while the shared state remains
untouched so that the regular ipv[46] logic may do its own cleanup. In
practice this means that OVS must have its own divergent implementation
of inet_frags_exit_net().
Joe Stringer [Tue, 12 Jul 2016 22:26:18 +0000 (15:26 -0700)]
compat: Fix IPv6 frag expiry crash.
If a user sends some fragments of an IPv6 message through OVS, but OVS
fails to assemble the IPv6 message and the OVS module is then unloaded
before the fragments expire, it could lead to a kernel panic like the
following:
Justin Pettit [Sun, 26 Jun 2016 05:22:52 +0000 (22:22 -0700)]
ovn: Add support for link-local addresses.
Every IPv6-enabled interface is supposed to have a link-local address
available to it. This commit adds a link local interface to each router
port and scopes link-local routes to the ingress port that received the
packet.
Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
ovn: Don't require clearing inport to hair-pin packets.
Introduce the "flags.loopback" symbol to allow packets to be sent back
on their ingress ports. Previously, one needed to clear "inport" to
hair-pin packets, but this made "inport" not available for future
matching. This approach should be more intuitive, but it will also be
needed in future patches.
This patch also removes functionality from the OVN expression library
that clears the OpenFlow ingress port when the logical input port is
zeroed.
Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Justin Pettit [Tue, 17 May 2016 11:06:13 +0000 (04:06 -0700)]
ovn-northd: Implement basic IPv6 routing.
This commit only supports static MAC bindings. A future commit will add
support for dynamic IPv6/MAC bindings. It has a few other limitations
described in "ovn/TODO".
Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Justin Pettit [Thu, 23 Jun 2016 01:21:40 +0000 (18:21 -0700)]
ovn: Rename "nd" action to "nd_na".
Rename "nd" to "nd_na" to be more descriptive and consistent with other
ND messages and actions. This commit also fixes some minor
documentation issues and limits the action to responding to Neighbor
Solicitation messages.
Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Justin Pettit [Thu, 23 Jun 2016 01:18:14 +0000 (18:18 -0700)]
ovn-controller: Tighten "nd" definition, add "nd_ns" and "nd_na".
According to RFC 4861, Neighbor Discovery messages should only match
when the Hop Limit is 255 to prevent off-link senders from sending ND
messages. This commit limits matching to that Hop Limit.
It also introduces Neighbor Discovery Solicitation ("nd_ns") and
Advertisement ("nd_na") definitions.
The "nd.sll" and "nd.tll" only apply to "nd_ns" and "nd_na",
respectively. This commit limits those symbols appropriately. (Note
that Router and Redirect also use those fields, so we will need to
include them as well when they are added.)
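The tightened match can be sketched as a plain predicate. struct pkt_fields and the is_nd*() helpers are hypothetical, not the OVN symbol definitions; the constants are the standard values (ICMPv6 is IP protocol 58, Neighbor Solicitation is type 135, Neighbor Advertisement is type 136, both with code 0).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum {
    PROTO_ICMPV6 = 58,  /* IPv6 next header value for ICMPv6. */
    ND_NS_TYPE = 135,   /* Neighbor Solicitation. */
    ND_NA_TYPE = 136,   /* Neighbor Advertisement. */
    ND_HOP_LIMIT = 255  /* Required by RFC 4861. */
};

/* Hypothetical, already-parsed header fields. */
struct pkt_fields {
    uint8_t ip_proto;
    uint8_t hop_limit;
    uint8_t icmp6_type;
    uint8_t icmp6_code;
};

/* Common part: ICMPv6, code 0, and a hop limit of exactly 255. */
static bool nd_common(const struct pkt_fields *p)
{
    return p->ip_proto == PROTO_ICMPV6
           && p->icmp6_code == 0
           && p->hop_limit == ND_HOP_LIMIT;
}

static bool is_nd_ns(const struct pkt_fields *p)
{
    return nd_common(p) && p->icmp6_type == ND_NS_TYPE;
}

static bool is_nd_na(const struct pkt_fields *p)
{
    return nd_common(p) && p->icmp6_type == ND_NA_TYPE;
}

static bool is_nd(const struct pkt_fields *p)
{
    return is_nd_ns(p) || is_nd_na(p);
}
```

A packet that has been forwarded by a router arrives with a hop limit below 255, so it fails nd_common() and is not treated as Neighbor Discovery.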
Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Justin Pettit [Sat, 18 Jun 2016 00:17:58 +0000 (17:17 -0700)]
packets: Cleanup ND compose functions.
Rename "compose_nd" and "compose_na" to "compose_nd_ns" and
"compose_nd_na", respectively, to be clearer about their functionality.
This will also make it more consistent when we add Neighbor Discovery
Router Solicitation/Advertisement compose functions.
Also change the source and destination IPv6 addresses to take
"struct in6_addr" arguments, which are more common in the code base.
Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Add support for flow control (MAC control frames) to DPDK-enabled physical
port types. By default, flow control is OFF on both the rx and tx sides.
Flow control can be enabled or disabled either when adding a port to OVS
or at run time.
For example:
To enable flow control support at tx side while adding a port, add the
'tx-flow-ctrl' option to the 'ovs-vsctl add-port' command-line as below.
Similarly to enable rx flow control,
'ovs-vsctl add-port br0 dpdk0 -- \
set Interface dpdk0 type=dpdk options:rx-flow-ctrl=true'
And to enable the flow control auto-negotiation,
'ovs-vsctl add-port br0 dpdk0 -- \
set Interface dpdk0 type=dpdk options:flow-ctrl-autoneg=true'
To turn ON tx flow control at run time (after the port has been added
to OVS), the command-line input will be,
'ovs-vsctl set Interface dpdk0 options:tx-flow-ctrl=true'
The flow control parameters can be turned off by setting the
respective parameter to 'false'. To disable flow control on the tx side,
'ovs-vsctl set Interface dpdk0 options:tx-flow-ctrl=false'
ofproto internally modifies 'modify_cookie' field, and adding a
replica to ofproto_flow_mod allows the ofputil_flow_mod argument to be
changed to a const.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Allow adding group mods in OpenFlow bundles. Group mods are executed
atomically with any flow mods in the same bundle. Mods are executed
in order, so groups appearing in flow actions need to be inserted
into the bundle before the dependent flow mods.
ovs-ofctl is enhanced to allow the '--bundle' option with group mod
commands. add-groups file format is enhanced to allow each line to be
preceded by one of the keywords "add", "modify", "delete",
"add_or_mod", "insert_bucket", or "remove_bucket".
ovs-ofctl also has a new "bundle" command that reads a file in which
each line contains one flow mod or group mod, and then executes them
all as a single atomic bundle transaction.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Use ofputil_uninit_group_mod() instead of
ofputil_bucket_list_destroy(). Currently these have the same effect,
but this will change in a following patch.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
ofp-util: Do not free() field that is not allocated.
Group properties field array is not dynamically allocated, so it
should not be freed. This has not been a problem, as this function
has not been called by anyone so far, but following patch will.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
ofproto-dpif-xlate: Hash only fields specified for 'hash' selection method.
The mask for non-present fields in struct field_array is always zero,
so hashing a prerequisite field that was not also specified for the
"hash" selection method boiled down to hashing an all-zeroes value and
unwildcarding the prerequisite field. Now that mf_are_prereqs_ok()
already takes care of unwildcarding, we can simplify the code by
hashing only the specified fields.
Also change the test case to include fields that have prerequisites.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
meta-flow: Clean up masking with prerequisites checking.
Change mf_are_prereqs_ok() to take a flow_wildcards pointer, so that the
wildcards can be set at the same time as the prerequisites are
checked. This makes it easier to write more obviously correct code.
Remove the functions mf_mask_field_and_prereqs() and
mf_mask_field_and_prereqs__(), and make the callers first check the
prerequisites, while supplying 'wc' to mf_are_prereqs_ok(), and if
successful, mask the bits of the field that were read or set using
mf_mask_field_masked().
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
ofproto-dpif: Always forward 'used' from the old_rule.
Use new rule's flags to determine whether stats should be forwarded
from the old, modified rule to the new rule. This captures the fact
that prior to OpenFlow 1.2, which defines the reset counts flag, the
reset counts semantics was assumed by default. However, in that case
the reset counts flag is only present in the new flow, not on the
corresponding flow mod.
Having the above fixed revealed that the 'used' timestamp was not
forwarded from the old rule to the new rule when counts were not being
forwarded. Fix this by always forwarding the 'used' timestamp.
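The resulting rule can be sketched as follows. struct rule_stats and forward_stats() are hypothetical simplifications of the ofproto-dpif structures; they only show that the counters are conditional on the reset-counts flag while 'used' is copied unconditionally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified per-rule statistics. */
struct rule_stats {
    uint64_t n_packets;
    uint64_t n_bytes;
    long long int used;  /* Last-used timestamp. */
};

/* Carries state from the replaced rule to its replacement.  Counters are
 * forwarded only when the new rule does not ask for a reset; the 'used'
 * timestamp is always forwarded. */
static void forward_stats(const struct rule_stats *old_rule,
                          struct rule_stats *new_rule, bool reset_counts)
{
    if (!reset_counts) {
        new_rule->n_packets = old_rule->n_packets;
        new_rule->n_bytes = old_rule->n_bytes;
    }
    new_rule->used = old_rule->used;
}
```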