Ian Stokes [Wed, 2 Mar 2016 20:35:54 +0000 (20:35 +0000)]
netdev_dpdk.c: Add QoS functionality.
This patch provides the modifications required in netdev-dpdk.c and
vswitch.xml to allow for a DPDK user space QoS algorithm.
This patch adds a QoS configuration structure for netdev-dpdk and
expected QoS operations 'dpdk_qos_ops'. Various helper functions
are also supplied.
Also included are the modifications required for vswitch.xml to allow a
new QoS implementation for netdev-dpdk devices. This includes a new QoS type
`egress-policer` as well as its expected QoS table entries.
The QoS functionality implemented for DPDK devices is `egress-policer`.
This can be used to drop egress packets at a configurable rate.
The INSTALL.DPDK.md guide has also been modified to provide an example
configuration of `egress-policer` QoS.
Signed-off-by: Ian Stokes <ian.stokes@intel.com> Acked-by: Flavio Leitner <fbl@sysclose.org> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
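As a rough illustration of what an egress policer does, the following C sketch drops packets that exceed a configured byte rate using a single token bucket. It is not the netdev-dpdk implementation: the names `policer`, `policer_init` and `policer_allow`, and the explicit millisecond timestamp parameter, are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical single-rate token bucket; illustration only. */
struct policer {
    uint64_t rate;      /* Allowed rate, in bytes per second. */
    uint64_t burst;     /* Bucket depth, in bytes. */
    uint64_t tokens;    /* Bytes currently available. */
    uint64_t last_ms;   /* Time of the last refill, in milliseconds. */
};

static void
policer_init(struct policer *p, uint64_t rate, uint64_t burst,
             uint64_t now_ms)
{
    assert(rate > 0 && burst > 0);
    p->rate = rate;
    p->burst = burst;
    p->tokens = burst;          /* Start with a full bucket. */
    p->last_ms = now_ms;
}

/* Returns true if a packet of 'len' bytes may be sent at time 'now_ms',
 * false if it should be dropped. */
static bool
policer_allow(struct policer *p, uint64_t len, uint64_t now_ms)
{
    uint64_t refill = (now_ms - p->last_ms) * p->rate / 1000;

    if (refill > 0) {
        /* Add the tokens earned since the last refill, capped at 'burst'. */
        uint64_t total = p->tokens + refill;

        p->tokens = total > p->burst ? p->burst : total;
        p->last_ms = now_ms;
    }
    if (p->tokens < len) {
        return false;           /* Over the configured rate: drop. */
    }
    p->tokens -= len;
    return true;
}
```

A caller would invoke `policer_allow()` once per egress packet and drop the packet when it returns false.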
Jarno Rajahalme [Wed, 17 Feb 2016 22:08:04 +0000 (14:08 -0800)]
ofp: Add support for bundles extension in OpenFlow 1.3.
ONF Extension 230 adds support for OpenFlow 1.4 bundles to OpenFlow
1.3. Supporting this allows OpenFlow 1.3 controllers to start using
bundles. Also the ovs-ofctl '--bundle' option can now be used with
OpenFlow 1.3.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Jarno Rajahalme [Mon, 29 Feb 2016 19:13:28 +0000 (11:13 -0800)]
odp-util: Use FLOW_MAX_MPLS_LABELS when parsing MPLS ODP keys.
Even though the number of supported MPLS labels may vary between a
datapath and the OVS userspace, it is better to use the
FLOW_MAX_MPLS_LABELS than a hard-coded '3' as the maximum number of
labels to scan.
Requested-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Andy Zhou [Wed, 24 Feb 2016 01:48:11 +0000 (17:48 -0800)]
ovsdb-server: Refactoring and clean up remote status reporting.
When reporting remote status, a listening remote will randomly
pick a session and report its session status. This does not seem
to make much sense. It is probably better to leave those fields
untouched.
Update ovs-vswitchd.conf.db(5) to match the change in implementation.
Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Han Zhou [Fri, 26 Feb 2016 04:49:46 +0000 (20:49 -0800)]
ovn: Connect to remote lports through localnet port.
Before this patch, inter-chassis communication between VIFs of the same
lswitch always goes through a tunnel. Modeling a single physical network
therefore takes many lswitches and pairs of lports, and adds complexity
in a CMS such as OpenStack Neutron, which has to manage those lswitches
and lports.
With this patch, inter-chassis communication can go through physical
networks via localnet port with a 1:1 mapping between lswitches and
physical networks. The pipeline becomes:
Han Zhou [Fri, 26 Feb 2016 04:26:23 +0000 (20:26 -0800)]
ovn: Avoid ARP responder for packets from localnet port
This is required by the next commit, which allows an lswitch with a
localnet port to be attached to multiple chassis. Without this patch, if
an ARP request comes from a localnet port, every chassis generates an
ARP response, which is not desired.
A new stage, ls_in_arp_rsp, is introduced for the ARP responder before
ls_in_l2_lkup.
Suggested-by: Russell Bryant <russell@ovn.org> Signed-off-by: Han Zhou <zhouhan@gmail.com> Acked-by: Russell Bryant <russell@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org>
Detected internal interfaces: br-int p1 [ OK ]
Saving flows /usr/share/openvswitch/scripts/ovs-ctl:
line 267: /usr/share/openvswitch/scripts/ovs-save: No such file or directory
[FAILED]
Exiting ovsdb-server (3228) [ OK ]
Starting ovsdb-server [ OK ]
Configuring Open vSwitch system IDs [ OK ]
Exiting ovs-vswitchd (3243) [ OK ]
Saving interface configuration /usr/share/openvswitch/scripts/ovs-ctl:
line 294: /usr/share/openvswitch/scripts/ovs-save: No such file or directory
[FAILED]
Failed to save configuration, not replacing kernel module ... (warning).
Starting ovs-vswitchd [ OK ]
Enabling remote OVSDB managers [ OK ]
Ben Pfaff [Mon, 22 Feb 2016 17:57:50 +0000 (09:57 -0800)]
tests: Move Autotest compatibility macros into tests directory.
compat.at mixes compatibility for m4sh, which is used by Autoconf and
Autotest, with compatibility for Autotest. It makes more sense to separate
them. This moves the Autotest-only compatibility macros into an Autotest
specific file.
Ansis Atteka [Tue, 19 Jan 2016 17:59:12 +0000 (09:59 -0800)]
rhel: provide our own SELinux custom policy package
CentOS, RHEL and Fedora distributions ship with their own Open vSwitch
SELinux policy that is too strict and prevents Open vSwitch from working
normally out of the box.
As a solution, this patch introduces a new package which will "loosen"
up "openvswitch_t" SELinux domain so that Open vSwitch could operate
normally.
Intended use-cases of this package are:
1. to allow users to install newer Open vSwitch on already released Fedora,
RHEL and CentOS distributions where the default Open vSwitch SELinux policy
that shipped with the corresponding Linux distribution is not up to date
and did not anticipate that a newer Open vSwitch version might need to
invoke new system calls or need to access certain system resources that
it did not before; and
2. to provide alternative means through which Open vSwitch developers
can proactively fix SELinux related policy issues without waiting for
corresponding Linux distribution maintainers to update their central
Open vSwitch SELinux policy.
This patch was tested on Fedora 23 and CentOS 7. I verified that now
on Fedora 23 Open vSwitch can create a NetLink socket, and that I did
not see the following error messages:
vlog|INFO|opened log file /var/log/openvswitch/ovs-vswitchd.log
ovs_numa|INFO|Discovered 2 CPU cores on NUMA node 0
ovs_numa|INFO|Discovered 1 NUMA nodes and 2 CPU cores
reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected
netlink_socket|ERR|fcntl: Permission denied
dpif_netlink|ERR|Generic Netlink family 'ovs_datapath' does not exist.
The Open vSwitch kernel module is probably not loaded.
dpif|WARN|failed to enumerate system datapaths: Permission denied
dpif|WARN|failed to create datapath ovs-system: Permission denied
I did not test all Open vSwitch features so there still could be some
OVS configuration that would get "Permission denied" errors.
Since Open vSwitch daemons on Ubuntu 15.10 run under the "unconfined"
SELinux domain by default, there is no need to create a similar Debian
package for Ubuntu: Open vSwitch already works on a default Ubuntu installation.
Numan Siddique [Mon, 22 Feb 2016 10:29:37 +0000 (15:59 +0530)]
ovn-northd: Allow lport 'addresses' to store multiple ips in each set
If a logical port has two ipv4 addresses and one ipv6 address
it will be stored as ["MAC IPv41 IPv42 IPv61"] instead of
["MAC IPv41", "MAC IPv42", "MAC IPv61"].
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
[blp@ovn.org made changes to comments and ovn.at] Signed-off-by: Ben Pfaff <blp@ovn.org>
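To make the new 'addresses' layout concrete, here is a minimal C sketch that splits one such set ("MAC IP1 IP2 ...") into its tokens. It is not the ovn-northd parser; the function name `split_lport_address` and the sample addresses are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Splits 'set', a string of the form "MAC IP1 IP2 ...", in place at
 * spaces.  Stores up to 'max_tokens' pointers in 'tokens' (tokens[0] is
 * the MAC, the rest are IP addresses) and returns how many were stored.
 * Hypothetical helper, for illustration only. */
static int
split_lport_address(char *set, char *tokens[], int max_tokens)
{
    char *p = set;
    int n = 0;

    assert(set != NULL && max_tokens >= 0);
    while (n < max_tokens) {
        p += strspn(p, " ");        /* Skip separators. */
        if (!*p) {
            break;                  /* End of string. */
        }
        tokens[n++] = p;
        p += strcspn(p, " ");       /* Advance to the end of the token. */
        if (*p) {
            *p++ = '\0';            /* Terminate the token. */
        }
    }
    return n;
}
```

With this layout a port with two IPv4 addresses and one IPv6 address yields four tokens from a single set, rather than three sets of two tokens each.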
Lance Richardson [Thu, 25 Feb 2016 15:57:28 +0000 (10:57 -0500)]
tests: Gracefully terminate daemons in OVN tests
Daemons started in OVN tests are currently killed (via "on_exit kill"
in start_daemon()). This is problematic for tools (such as gcov) that
rely on exit() being called.
Fix by using "ovs-appctl ... exit" to gracefully terminate the daemons.
Jarno Rajahalme [Thu, 25 Feb 2016 00:10:42 +0000 (16:10 -0800)]
xlate: Always recirculate after an MPLS POP to a non-MPLS ethertype.
So far we have tried to optimize MPLS POP action not to recirculate
unless later matching actually needs the inner headers. This made the
code complex and error-prone. Also the cases where this optimization
would have been useful seem rare, as one would typically want to do
something else with the inner packet than blindly send it to some
output port.
With this change multiple consecutive MPLS POPs do not need
recirculation in between, so even if the blind output case is now
a little less optimal, the multiple POP case is correspondingly
faster with this change.
Suggested-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Jarno Rajahalme [Thu, 25 Feb 2016 00:10:42 +0000 (16:10 -0800)]
odp-util: Format and scan multiple MPLS labels.
So far we have been limited to including only one MPLS label in the
textual datapath flow format. Allow up to 3 labels to be included so
that testing with multiple labels becomes easier.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Jarno Rajahalme [Thu, 25 Feb 2016 00:10:42 +0000 (16:10 -0800)]
tests: Fix MPLS tests.
Some MPLS tests used non-MPLS ethertype for popping a label from a
multi-label stack. Also, reveal actions in some MPLS tests. This
will make later patches more easily understandable.
Fix the mpls-xlate banner and remove '-generate' option from MPLS
tests as it is no longer needed to create recirculation state.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Andy Zhou [Mon, 22 Feb 2016 08:35:28 +0000 (00:35 -0800)]
ovsdb: avoid unnecessary call to ovsdb_monitor_get_update()
Optimize ovsdb_jsonrpc_monitor_flush_all() by not calling
ovsdb_monitor_get_update() on monitors that do not have any
unflushed updates. This change saves CPU cycles in ovsdb-server's
main loop, but should not introduce any client-visible changes.
Reported-by: Liran Schour <lirans@il.ibm.com> Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Liran Schour <lirans@il.ibm.com> Acked-by: Ben Pfaff <blp@ovn.org>
Andy Zhou [Mon, 22 Feb 2016 08:24:06 +0000 (00:24 -0800)]
ovsdb: Fix off-by-one error in tracking monitor changes
dbmon's changes should be stored with the next transaction number,
rather than the current transaction number. This bug causes the
changes of a transaction stored in a monitor to go unnoticed by
the jsonrpc connection that is responsible for flushing the monitor
content.
However, the bug was not noticed until it was exposed by a later
optimization patch: "avoid unnecessary call to ovsdb_monitor_get_update()."
Without that optimization, the update is still generated
when 'unflushed' equals n_transactions + 1, which should have
indicated that the monitor has been flushed already.
Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Liran Schour <lirans@il.ibm.com> Acked-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Tue, 9 Feb 2016 19:44:40 +0000 (11:44 -0800)]
travis: Automatically recheck failed tests.
This should make the automatic testsuite more reliable on Travis. It's
better to fix tests to be more reliable, of course, but in practice it's
difficult to make all of them 100% reliable.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
Linux kernel network devices in a guest should have the number of
multi-purpose channels configured when used with DPDK multiqueue on the host.
This commit adds an example of how this can be done. Also add QEMU 2.5
requirements for multiqueue with DPDK in NEWS.
Signed-off-by: Ian Stokes <ian.stokes@intel.com> Acked-by: Flavio Leitner <fbl@sysclose.org> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Ilya Maximets [Wed, 24 Feb 2016 14:14:43 +0000 (17:14 +0300)]
netdev-dpdk: vhost-user: Fix sending packets to queues not enabled by guest.
Currently the virtio driver in the guest operating system has to be
configured to use exactly the same number of queues. If the guest uses
fewer queues, some packets will get stuck in queues unused by the guest
and will not be received.
Fix that by using new 'vring_state_changed' callback, which is
available for vhost-user since DPDK 2.2.
The implementation uses an additional mapping from configured tx queues to
those enabled by the virtio driver. This requires mandatory locking of TX
queues in __netdev_dpdk_vhost_send(), but this locking was almost always in
effect anyway because set_multiq is called with n_txq = 'ovs_numa_get_n_cores() + 1'.
OVS_VHOST_MAX_QUEUE_NUM = 1024 was chosen based on the fact that this is
the maximum number of queues supported by QEMU.
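The mapping from configured tx queues to guest-enabled queues can be pictured with the sketch below. It is an assumption about one reasonable policy (enabled queues map to themselves, disabled queues are spread round-robin over the enabled ones), not a copy of the netdev-dpdk code; `remap_txqs` and `TXQ_DISABLED` are hypothetical names.

```c
#include <stdbool.h>

#define TXQ_DISABLED (-1)   /* Hypothetical marker: no usable queue. */

/* Fills 'map' so that map[i] is the queue actually used when sending on
 * configured queue 'i'.  'enabled[i]' says whether the guest enabled
 * queue 'i'.  Both arrays have 'n' elements. */
static void
remap_txqs(const bool enabled[], int map[], int n)
{
    int n_enabled = 0;
    int next = 0;               /* Round-robin cursor over the queues. */

    for (int i = 0; i < n; i++) {
        n_enabled += enabled[i];
    }
    for (int i = 0; i < n; i++) {
        if (!n_enabled) {
            map[i] = TXQ_DISABLED;      /* Guest enabled nothing. */
        } else if (enabled[i]) {
            map[i] = i;                 /* Enabled queues map to themselves. */
        } else {
            while (!enabled[next]) {    /* Find the next enabled queue. */
                next = (next + 1) % n;
            }
            map[i] = next;
            next = (next + 1) % n;
        }
    }
}
```

Because several configured queues can end up sharing one enabled queue, senders have to take a per-queue lock, which is why the commit makes locking mandatory.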
Saloni Jain [Thu, 18 Feb 2016 10:24:26 +0000 (15:54 +0530)]
Implement OFPT_TABLE_STATUS Message.
When the state of a table changes, the controller needs to be informed with
the OFPT_TABLE_STATUS message. The message is sent with reason
OFPTR_VACANCY_DOWN or OFPTR_VACANCY_UP when the remaining space
crosses one of the thresholds.
Signed-off-by: Saloni Jain <saloni.jain@tcs.com> Co-authored-by: Rishi Bamba <rishi.bamba@tcs.com> Signed-off-by: Rishi Bamba <rishi.bamba@tcs.com>
[blp@ovn.org added vacancy event initialization and tests
and updated NEWS] Signed-off-by: Ben Pfaff <blp@ovn.org>
Simon Horman [Wed, 20 Jan 2016 06:15:01 +0000 (15:15 +0900)]
flow: add miniflow_pad_from_64
Provide leading padding to allow pushing a value to a miniflow where
the value is not aligned to 64 bits and no value has already been
pushed to the same word.
This will be used by a follow-up patch to allow layer 3 packets, that is,
packets without an Ethernet header, to be represented in flows.
Signed-off-by: Simon Horman <simon.horman@netronome.com> Acked-by: Jarno Rajahalme <jarno@ovn.org>
netdev-dpdk: Do not add vhost-user ports with '/' or '\' in name.
This check prevents an obvious way for a vhost-user socket to escape the
intended directory.
There might be other ways to escape the directory (none comes to mind at
the moment), but this is a problem that should be properly solved by
mandatory access control.
A similar check is done for a bridge name, since that name is used as
part of a socket as well.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Flavio Leitner <fbl@sysclose.org>
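The kind of check described here can be sketched in a few lines of C. The function name `port_name_is_safe` is hypothetical; the only library call used is the standard `strpbrk()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if 'name' can be used as a single path component, that
 * is, if it contains neither '/' nor '\'.  Illustration only. */
static bool
port_name_is_safe(const char *name)
{
    assert(name != NULL);
    return strpbrk(name, "/\\") == NULL;
}
```

As the message notes, a check like this only blocks the obvious escape; it does not replace mandatory access control.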
Mauricio Vásquez [Tue, 23 Feb 2016 22:06:38 +0000 (23:06 +0100)]
tests/dpdk/ring_client: extend range of supported dpdkr ports
The current implementation of the ring_client test only supports ports up
to dpdkr255. This patch extends it to support the full range of possible
dpdkr ports.
Signed-off-by: Mauricio Vasquez B <mauricio.vasquezbernal@studenti.polito.it> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Lance Richardson [Mon, 15 Feb 2016 15:08:51 +0000 (10:08 -0500)]
lib: Fix netbsd compilation error.
NetBSD requires <netinet/in.h> to be included before <netinet/ip6.h>.
Without this fix we have:
In file included from lib/netdev-vport.c:25:0:
/usr/include/netinet/ip6.h:82:18: error: field 'ip6_src' has incomplete type
/usr/include/netinet/ip6.h:83:18: error: field 'ip6_dst' has incomplete type
Signed-off-by: Lance Richardson <lrichard@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
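The required include order looks like this in practice. The helper `ip6_header_len` is hypothetical and exists only so that the snippet uses `struct ip6_hdr`; the 40-byte size is the fixed IPv6 header length.

```c
#include <assert.h>
#include <sys/types.h>
#include <netinet/in.h>     /* First: defines struct in6_addr. */
#include <netinet/ip6.h>    /* struct ip6_hdr embeds struct in6_addr. */

/* The fixed IPv6 header is 40 bytes long. */
static_assert(sizeof(struct ip6_hdr) == 40, "unexpected IPv6 header size");

static size_t
ip6_header_len(void)
{
    return sizeof(struct ip6_hdr);
}
```

On systems whose <netinet/ip6.h> pulls in its own dependencies the order does not matter, so putting <netinet/in.h> first is the portable choice.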
ofproto-dpif-xlate: Fix crash when using multicast snooping.
The revalidator thread may set may_learn and call xlate_actions with no packet
data. If the revalidated flow is IGMPv3 or MLD, vswitchd will crash when trying
to access the NULL packet.
Only process IGMP and MLD flows when there is a packet. This is similar
to the behavior we have for other special packets.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com> Reported-by: Yi Ba <yby.developer@yahoo.com>
Reported-at: http://openvswitch.org/pipermail/discuss/2016-January/020023.html Fixes: 06994f879c9d ("mcast-snooping: Add Multicast Listener Discovery support") Signed-off-by: Ben Pfaff <blp@ovn.org>
Ilya Maximets [Mon, 8 Feb 2016 15:30:29 +0000 (18:30 +0300)]
dpif-netdev: Reload each thread only once in do_add_port.
When a pmd interface with multiple queues is added, several queues
may be assigned to one thread, and that thread is then reloaded
once for each added queue.
Alin Serdean [Thu, 11 Feb 2016 03:09:32 +0000 (03:09 +0000)]
build-windows: Enable parallel jobs for msbuild
This patch enables parallel build from the command line.
If vstudio_config is defined, change the target from
'make ovsext_make' to 'make ovsext' and update its dependency,
since the project requires OvsDpInterface.h to be built.
Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Acked-by: Sairam Venugopal <vsairam@vmware.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Russell Bryant [Fri, 18 Dec 2015 13:33:30 +0000 (08:33 -0500)]
ovsdb.at: Run Python tests for Python 2 and 3.
ovsdb.at includes some macros for running some identical test cases for
both C and Python. Update these macros to run the test case for both
Python 2 and 3. Retain the existing behavior for the direct use of the
_PY versions of these macros to only run against Python 2 without any
changes needed.
Signed-off-by: Russell Bryant <russell@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Russell Bryant [Fri, 18 Dec 2015 00:58:33 +0000 (19:58 -0500)]
configure: Check for presence of Python 3.
The configure script already checked for Python 2 (>=2.7). Add another
check for Python 3 (>=3.4). This will be used later for automatically
running tests with Python 3 as well if available.
Signed-off-by: Russell Bryant <russell@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Russell Bryant [Thu, 17 Dec 2015 17:22:31 +0000 (12:22 -0500)]
tests: Deal with Python output differences.
This test checks the output based on Python's string representation of
an array of two unicode strings. These strings have a "u" prefix in
Python 2, but not Python 3. In Python 3, all strings are unicode.
Use sed on the output to strip the "u" from Python 2 output when
checking for the expected result.
Signed-off-by: Russell Bryant <russell@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
William Tu [Thu, 18 Feb 2016 02:00:22 +0000 (18:00 -0800)]
gcc: Fix compile errors due to anonymous union initialization.
gcc 4.4.7 lets you initialize named fields, and assign to anonymous union members,
but cannot statically initialize a named member of an anonymous union. This causes
errors when doing make:
ofproto/fail-open.c: In function ‘send_bogus_packet_ins’:
ofproto/fail-open.c:130: error: unknown field ‘pin’ specified in initializer
ofproto/fail-open.c:131: error: unknown field ‘up’ specified in initializer
ofproto/fail-open.c:132: error: unknown field ‘packet’ specified in initializer
ofproto/fail-open.c:132: warning: missing braces around initializer
ofproto/fail-open.c:132: warning: (near initialization for ‘am.<anonymous>.pin.up’)
ofproto/fail-open.c:134: error: extra brace group at end of initializer
Example: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42875
We can either assign a name to the union or, in this patch, remove the unnecessary union.
Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
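For readers who have not hit this limitation, the sketch below shows the pattern that older gcc accepts: initialize only the named outer fields, then assign to the anonymous union member. The commit itself takes a different route and removes the union; `struct msg` and `make_msg` are hypothetical names used only here.

```c
/* Hypothetical struct with an anonymous union, for illustration. */
struct msg {
    int type;
    union {
        int code;
        long value;
    };
};

static struct msg
make_msg(int type, int code)
{
    /* Writing '.code = code' in this initializer is what gcc 4.4.7
     * rejects with "unknown field ... specified in initializer". */
    struct msg m = { .type = type };

    m.code = code;      /* Plain assignment to the member is accepted. */
    return m;
}
```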
Ben Pfaff [Tue, 16 Feb 2016 19:13:35 +0000 (11:13 -0800)]
tests: Better tolerate file system restriction on file name length.
ecryptfs on Linux restricts file names to 143 bytes, but these two tests
used a 150-byte name. This commit fixes the specific problem on ecryptfs
by reducing the name they test to 143 bytes. It also fixes the more general
problem of name length restrictions by skipping the test, rather than
failing it, if a directory with the 143-byte name cannot be created, since
the most likely problem is that the name is too long for the file system.
Reported-by: Zoltán Balogh <zoltan.balogh@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Tue, 9 Feb 2016 00:52:45 +0000 (16:52 -0800)]
hmap: Add extra build-time iteration checks for types derived from hmap.
Some of our data structures derived from hmap use the same member names.
This means it's possible to confuse them in iteration, e.g. to iterate a
shash with SIMAP_FOR_EACH. Of course this will crash at runtime, but it
seems even better to catch it at compile time.
An alternative would be to use unique member names, e.g. shash_map and
simap_map instead of just map. I like short names, though.
It's kind of nasty that we need support from the hmap code to do this.
An alternative would be to insert the build assertions as statements before
the for loop. But that would cause nasty surprises if someone forgets the
{} around a block of statements; even though the OVS coding style requires
them in all cases, I suspect that programmers doing debugging, etc. tend
to omit them sometimes.
It's not actually necessary to have multiple variants of these macros,
e.g. one can write a C99-compliant HMAP_FOR_EACH that accepts 3 or 4 or
more arguments. But such a macro is harder to read, so I don't know
whether this is a good tradeoff.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
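The idea of catching a mixed-up container type at build time can be sketched as follows. This example swaps in C11 `_Generic` as the checking mechanism, which is an assumption made for brevity and not necessarily how the hmap macros do it; the structs are cut-down stand-ins and `CHECKED_PTR`, `SHASH_COUNT` and `SIMAP_COUNT` are hypothetical.

```c
#include <stddef.h>

/* Cut-down stand-ins for the real types; both use the member name 'map'. */
struct hmap { size_t n; };
struct shash { struct hmap map; };
struct simap { struct hmap map; };

/* Evaluates to 'ptr', but only compiles if 'ptr' has exactly type 'type'
 * (there is deliberately no 'default' association). */
#define CHECKED_PTR(ptr, type) _Generic((ptr), type: (ptr))

/* Without the check, either macro would accept either container,
 * because both have a member called 'map'. */
#define SHASH_COUNT(sh) (CHECKED_PTR(sh, struct shash *)->map.n)
#define SIMAP_COUNT(sm) (CHECKED_PTR(sm, struct simap *)->map.n)
```

Passing a `struct simap *` to `SHASH_COUNT` now fails to compile instead of misbehaving at run time.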
Ben Pfaff [Thu, 18 Feb 2016 18:12:04 +0000 (10:12 -0800)]
ofp-actions: Introduce macro for padding struct members.
An upcoming commit will add another case where it's desirable to ensure
that a variable-length array is aligned on an 8-byte boundary. This macro
makes that a little easier.
Signed-off-by: Ben Pfaff <blp@ovn.org> CC: Joe Stringer <joe@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
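A minimal sketch of such a padding macro follows, assuming the goal is simply to round the fixed members up to a multiple of 8 bytes. `PAD_BYTES` and `struct example_action` are hypothetical names, not the macro this commit introduces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of pad bytes needed after 'size' bytes to reach the next
 * multiple of 'align'.  Yields 0 when 'size' is already aligned, in
 * which case it must not be used as an array size in ISO C. */
#define PAD_BYTES(size, align) (((align) - (size) % (align)) % (align))

struct example_action {
    uint16_t type;
    uint16_t len;
    uint16_t flags;
    uint8_t pad[PAD_BYTES(3 * sizeof(uint16_t), 8)];
    uint8_t data[];             /* Variable-length part. */
};

/* The variable-length array starts on an 8-byte boundary. */
static_assert(offsetof(struct example_action, data) % 8 == 0,
              "data must be 8-byte aligned");
```

Computing the pad length from the member sizes keeps the layout correct when a fixed member is added or removed later.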
Ben Pfaff [Sat, 20 Feb 2016 00:10:06 +0000 (16:10 -0800)]
Implement serializing the state of packet traversal in "continuations".
One purpose of OpenFlow packet-in messages is to allow a controller to
interpose on the path of a packet through the flow tables. If, for
example, the controller needs to modify a packet in some way that the
switch doesn't directly support, the controller should be able to
program the switch to send it the packet, then modify the packet and
send it back to the switch to continue through the flow table.
That's the theory. In practice, this doesn't work with any but the
simplest flow tables. Packet-in messages simply don't include enough
context to allow the flow table traversal to continue. For example:
* Via "resubmit" actions, an Open vSwitch packet can have an
effective "call stack", but a packet-in can't describe it, and
so it would be lost.
* A packet-in can't preserve the stack used by NXAST_PUSH and
NXAST_POP actions.
* A packet-in can't preserve the OpenFlow 1.1+ action set.
* A packet-in can't preserve the state of Open vSwitch mirroring
or connection tracking.
This commit introduces a solution called "continuations". A continuation
is the state of a packet's traversal through OpenFlow flow tables. A
"controller" action with the "pause" flag, which is newly implemented in
this commit, generates a continuation and sends it to the OpenFlow
controller in a packet-in asynchronous message (only NXT_PACKET_IN2
supports continuations, so the controller must configure them with
NXT_SET_PACKET_IN_FORMAT). The controller processes the packet-in,
possibly modifying some of its data, and sends it back to the switch with
an NXT_RESUME request, which causes flow table traversal to continue. In
principle, a single packet can be paused and resumed multiple times.
Another way to look at it is:
- "pause" is an extension of the existing OFPAT_CONTROLLER
action. It sends the packet to the controller, with full
pipeline context (some of which is switch implementation
dependent, and may thus vary from switch to switch).
- A continuation is an extension of OFPT_PACKET_IN, allowing for
implementation dependent metadata.
- NXT_RESUME is an extension of OFPT_PACKET_OUT, with the
semantics that the pipeline processing is continued with the
original translation context from where it was left at the time
it was paused.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
Ben Pfaff [Fri, 19 Feb 2016 23:56:52 +0000 (15:56 -0800)]
ofp-prop: Add support for putting and parsing nested properties.
It hadn't occurred to me before that any special support was actually
necessary or useful for nested properties, but the functions introduced in
this commit are nice wrappers to deal with the extra 4-byte padding that
ensures that the nested properties begin on 8-byte boundaries just like
the outer properties.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
Ben Pfaff [Thu, 18 Feb 2016 23:13:09 +0000 (15:13 -0800)]
ofpbuf: New function ofpbuf_const_initializer().
A number of times I've looked at code and thought that it would be easier
to understand if I could write an initializer instead of
ofpbuf_use_const(). This commit adds a function for that purpose and
adapts a lot of code to use it, in the places where I thought it made
the code better.
In theory this could improve code generation since the new function can
be inlined whereas ofpbuf_use_const() isn't. But I guess that's probably
insignificant; the intent of this change is code readability.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
Ben Pfaff [Fri, 5 Feb 2016 23:30:26 +0000 (15:30 -0800)]
tests: Add mirror-related keywords to all the mirroring tests.
Autotest isn't too smart, so if you try to use "mirroring" as a keyword
before this commit it doesn't select most of the tests due to the comma in
the test names.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
Ben Pfaff [Sat, 6 Feb 2016 03:16:01 +0000 (19:16 -0800)]
ofproto-dpif-xlate: Don't consider mirrors used when excluded by VLAN.
Mirrors can be configured to select packets for mirroring on the basis
of multiple criteria: input ports, output ports, and VLANs. A packet P
is to be mirrored if there exists a mirror M such that either:
- P ingresses on an input port selected by M, or
- P egresses on an output port selected by M
AND P is in a VLAN selected by M.
In addition, every mirror has a destination, which can be an output port
or an output VLAN. Either way, if a packet is mirrored to a particular
destination, it is done only once, even if different mirrors both select
a packet and have the same destination.
Since commit 7efbc3b7c4006c (ofproto-dpif-xlate: Rewrite mirroring to better
fit flow translation.), these requirements have been implemented
incorrectly: if a packet satisfied one of the bulleted requirements
above for mirror M1, but not the VLAN selection requirement for M1,
then it was not sent to M1's destination, but it was still considered
as having been sent to M1's destination for the purpose of avoiding output
duplication. Thus, if P satisfied *all* of the requirements for a
second mirror M2, if M1 and M2 had the same destination, the packet was
still not mirrored. This commit fixes that problem.
(The issue only occurred if M1 happened to have a smaller index than
M2 in OVS's internal data structures. That's just a matter of luck.)
Reported-by: Huanle Han <hanxueluo@gmail.com>
Reported-at: http://openvswitch.org/pipermail/dev/2016-January/064531.html Fixes: 7efbc3b7c4006c (ofproto-dpif-xlate: Rewrite mirroring to better fit flow translation.) Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
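The intended selection rule can be written down as a small C sketch. It is a simplification under stated assumptions: ports, VLANs and destinations are all numbered below 32 so that they fit in bitmaps, and `struct mirror` and `mirror_destinations` are hypothetical names unrelated to the ofproto-dpif-xlate data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified mirror: every field is a bitmap indexed by a number < 32. */
struct mirror {
    uint32_t in_ports;      /* Selected input ports. */
    uint32_t out_ports;     /* Selected output ports. */
    uint32_t vlans;         /* Selected VLANs. */
    int dest;               /* Destination identifier, < 32. */
};

/* Returns a bitmap of the destinations that receive one copy of a packet
 * that ingresses on 'in_port', egresses on 'out_port' and is in 'vlan'. */
static uint32_t
mirror_destinations(const struct mirror *mirrors, int n,
                    int in_port, int out_port, int vlan)
{
    uint32_t dests = 0;

    assert(in_port < 32 && out_port < 32 && vlan < 32);
    for (int i = 0; i < n; i++) {
        const struct mirror *m = &mirrors[i];
        bool port_match = (m->in_ports & (1u << in_port)) != 0
                          || (m->out_ports & (1u << out_port)) != 0;
        bool vlan_match = (m->vlans & (1u << vlan)) != 0;

        /* A destination counts as used only if the mirror selects the
         * packet on both criteria.  Using a bitmap sends at most one
         * copy per destination. */
        if (port_match && vlan_match) {
            dests |= 1u << m->dest;
        }
    }
    return dests;
}
```

In the scenario from the message, M1 matches the port but not the VLAN, M2 matches both, and both share a destination; the packet must still reach that destination exactly once.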
Joe Stringer [Sat, 13 Feb 2016 12:47:13 +0000 (04:47 -0800)]
datapath: lisp: Relax MTU constraints.
Currently, even if the entire path supports jumbo frames, the LISP netdev
limits the path MTU to 1500 bytes, and cannot be configured otherwise.
Relax the constraints on modifying the device MTU, and set it to the
maximum by default.
Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
Joe Stringer [Sat, 13 Feb 2016 12:32:36 +0000 (04:32 -0800)]
datapath: stt: Relax MTU constraints.
Currently, even if the entire path supports jumbo frames, the STT netdev
limits the path MTU to 1500 bytes, and cannot be configured otherwise.
Relax the constraints on modifying the device MTU, and set it to the
maximum by default.
Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
David Wragg [Thu, 18 Feb 2016 17:43:29 +0000 (17:43 +0000)]
datapath: geneve: Refine MTU limit.
Upstream commit:
Calculate the maximum MTU taking into account the size of headers
involved in GENEVE encapsulation, as for other tunnel types.
Changes in v3:
- Correct comment style
Changes in v2:
- Conform more closely to ip_tunnel_change_mtu
- Exclude GENEVE options from max MTU calculation
Signed-off-by: David Wragg <david@weave.works> Acked-by: Jesse Gross <jesse@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: aeee0e66c6b4 ("geneve: Refine MTU limit") Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
David Wragg [Wed, 10 Feb 2016 00:05:58 +0000 (00:05 +0000)]
datapath: Set a large MTU on tunnel devices.
Upstream commit:
Prior to 4.3, openvswitch tunnel vports (vxlan, gre and geneve) could
transmit vxlan packets of any size, constrained only by the ability to
send out the resulting packets. 4.3 introduced netdevs corresponding
to tunnel vports. These netdevs have an MTU, which limits the size of
a packet that can be successfully encapsulated. The default MTU
values are low (1500 or less), which is awkwardly small in the context
of physical networks supporting jumbo frames, and leads to a
conspicuous change in behaviour for userspace.
Instead, set the MTU on openvswitch-created netdevs to be the relevant
maximum (i.e. the maximum IP packet size minus any relevant overhead),
effectively restoring the behaviour prior to 4.3.
Signed-off-by: David Wragg <david@weave.works> Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: 7e059158d57b ("vxlan, gre, geneve: Set a large MTU on ovs-created
tunnel devices") Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
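The "maximum IP packet size minus any relevant overhead" rule is simple arithmetic, sketched below. The constant 65535 follows from the 16-bit IPv4 Total Length field; the overhead breakdown is an illustrative assumption for VXLAN over IPv4 without IP options, and the names are hypothetical rather than the kernel's.

```c
#include <assert.h>

/* Largest IPv4 packet: the Total Length field is 16 bits wide. */
#define IPV4_MAX_PACKET 65535

/* Example overhead for VXLAN over IPv4, assuming no IP options:
 * outer IPv4 (20) + UDP (8) + VXLAN (8) + inner Ethernet (14). */
enum { VXLAN_IPV4_OVERHEAD = 20 + 8 + 8 + 14 };

/* Returns the largest inner MTU that still fits in one outer IPv4 packet
 * once 'overhead' bytes of encapsulation headers are added. */
static int
tunnel_max_mtu(int overhead)
{
    assert(overhead >= 0 && overhead < IPV4_MAX_PACKET);
    return IPV4_MAX_PACKET - overhead;
}
```

Under these assumptions `tunnel_max_mtu(VXLAN_IPV4_OVERHEAD)` is 65485; the exact limit a given kernel enforces may differ.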
David Wragg [Wed, 10 Feb 2016 00:05:57 +0000 (00:05 +0000)]
datapath: geneve: Relax MTU constraints.
Upstream commit:
Allow the MTU of geneve devices to be set to large values, in order to
exploit underlying networks with larger frame sizes.
GENEVE does not have a fixed encapsulation overhead (an openvswitch
rule can add variable length options), so there is no relevant maximum
MTU to enforce. A maximum of IP_MAX_MTU is used instead.
Encapsulated packets that are too big for the underlying network will
get dropped on the floor.
Signed-off-by: David Wragg <david@weave.works> Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: 55e5bfb53cff ("geneve: Relax MTU constraints") Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
David Wragg [Wed, 10 Feb 2016 00:05:55 +0000 (00:05 +0000)]
datapath: vxlan: Relax MTU constraints.
Upstream commit:
Allow the MTU of vxlan devices without an underlying device to be set
to larger values (up to a maximum based on IP packet limits and vxlan
overhead).
Previously, their MTUs could not be set to higher than the
conventional ethernet value of 1500. This is a very arbitrary value
in the context of vxlan, and prevented vxlan devices from being able
to take advantage of jumbo frames etc.
The default MTU remains 1500, for compatibility.
Signed-off-by: David Wragg <david@weave.works> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: 72564b59ffc4 ("vxlan: Relax MTU constraints") Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
Ben Pfaff [Fri, 4 Dec 2015 22:12:05 +0000 (14:12 -0800)]
expr: Generalize wording of error message in expand_symbol().
The existing wording was very specific to the actual operation being
performed. While this is nice for users, it becomes difficult to maintain
as more and more operations are added. This commit makes the wording less
specific, because a third operation will start using this function in an
upcoming commit.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Justin Pettit <jpettit@ovn.org>