ovs_threads: Avoid running pthread destructors from main thread exit.
Windows uses pthreads-win32 library to provide the Linux pthread
functionality. It is observed that when the main thread calls
a pthread destructor after it exits, undefined behavior is seen
(e.g., junk values in data, causing pthread deadlocks).
Similar behavior has been seen by
other people as seen in the following email thread:
https://sourceware.org/ml/pthreads-win32/2003/msg00001.html
To avoid this, this commit de-registers the thread destructor
when the main thread exits (via the atexit handler).
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Although ovs-ctl/ovs-lib takes care of creating the rundir,
it is correct to let systemd manage the directory and let
the rpm know about the ownership too.
Signed-off-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
ovs-docker: Add the ability to set the mac address.
For testing OVN, it is useful to set the mac address
of the container. Since ovs-docker hasn't been part
of any released versions of OVS, it is probably OK
to change the options style.
Ansis Atteka [Tue, 26 May 2015 23:49:49 +0000 (16:49 -0700)]
debian: install openvswitch kernel module under "updates" directory
This patch fixes a bug where "modprobe openvswitch" command on Ubuntu
distribution would have sometimes tried to load OVS kernel module that
shipped together with Linux Kernel, even though one had also installed
OVS datapath debian package created with module-assistant. Because of
this issue force-reload-kmod command occasionally malfunctioned and
failed to load the right kernel module.
This bug happened *occasionally* because the default Ubuntu depmod
configuration in /etc/depmod.d/ubuntu.conf is set to look for kernel
modules first in "updates" directory, then in "ubuntu" directory and
then in other directories. If there were two openvswitch.ko modules
in "other directories", then modprobe would have loaded kernel
module that was nondeterministically listed first by file system.
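
The search-order behaviour described above can be sketched roughly as follows. This is a simplified model, not depmod's actual algorithm: the function name `pick_module` and the example paths are hypothetical, and only the "updates, then ubuntu, then everything else" ordering comes from the text.

```python
def pick_module(candidates, search_order=("updates", "ubuntu")):
    """Return the candidate paths that win under the given search order.

    candidates: module paths relative to /lib/modules/<version>/ (hypothetical).
    A result with more than one path means the choice is ambiguous.
    """
    def rank(path):
        top = path.split("/", 1)[0]
        # Directories not named in search_order all share the lowest priority.
        if top in search_order:
            return search_order.index(top)
        return len(search_order)

    best = min(rank(p) for p in candidates)
    return [p for p in candidates if rank(p) == best]


# A module under "updates" always wins over the in-tree one.
print(pick_module(["kernel/net/openvswitch/openvswitch.ko",
                   "updates/openvswitch.ko"]))
# Two modules in "other directories" tie, so the result is ambiguous.
print(pick_module(["kernel/net/openvswitch/openvswitch.ko",
                   "extra/openvswitch.ko"]))
```

Installing the package's module under "updates" removes the tie, which is why the load becomes deterministic.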
Signed-off-by: Ansis Atteka <aatteka@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Andy Zhou [Wed, 20 May 2015 20:14:29 +0000 (13:14 -0700)]
Revert "ovs-ofctl: Always prints recirc_id in decimal"
As there is the potential for this field to be maskable in future, and
the dpctl "-m" output prints a mask for it, return it to hexadecimal.
The next patch will make this consistent to the recirc action by making
the action print the recirc_id in hex as well.
Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: Andy Zhou <azhou@nicira.com>
Joe Stringer [Fri, 22 May 2015 17:24:34 +0000 (10:24 -0700)]
dpctl: Don't print UFID if not present.
With verbose dpctl, if userspace runs against an older kernel, every
entry will have "ufid:<empty>" at the beginning. This is unnecessary and
introduces an additional format for scripts to parse. Drop it.
Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Joe Stringer [Tue, 19 May 2015 21:20:31 +0000 (14:20 -0700)]
extract-ofp-fields: Port to python3.
Mostly "print foo" -> "print(foo)" and "iteritems() -> items()". The
latter may be less efficient in python2, but we're not dealing with
massive numbers of items here so it shouldn't noticeably slow the build.
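
A minimal illustration of the two kinds of change mentioned above. The `fields` table and the `describe` helper are made up for this sketch and are not taken from extract-ofp-fields.

```python
# Hypothetical field table, used only for illustration.
fields = {"recirc_id": 32, "in_port": 16}


def describe(fields):
    # Python 2 only:  for name, bits in fields.iteritems(): ...
    # Portable form:  items() builds a list in Python 2 and a view in Python 3.
    return ["%s: %d bits" % (name, bits)
            for name, bits in sorted(fields.items())]


for line in describe(fields):
    # Python 2 only:  print line
    print(line)
```

The extra list that items() builds under Python 2 is the inefficiency the message refers to.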
Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
netdev-dpdk: Adapt the requested number of tx and rx queues.
This commit changes the semantics of 'netdev_set_multiq()' to allow OVS
DPDK to run on devices with limited multi queue support.
* If a netdev doesn't have the requested number of rxqs it can simply
inform the datapath without failing.
* If a netdev doesn't have the requested number of txqs it should try
to create as many as possible and use locking.
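
One way to picture the tx side is sketched below. The class name `TxQueues` and the modulo mapping from requested queue id to real queue are assumptions made for illustration; the text only says that fewer queues are created and that locking is used.

```python
import threading


class TxQueues:
    """Toy model of a device that may support fewer txqs than requested."""

    def __init__(self, requested, supported):
        self.n_txq = min(requested, supported)
        # Locking is only needed when several senders share a real queue.
        self.needs_lock = self.n_txq < requested
        self.locks = [threading.Lock() for _ in range(self.n_txq)]
        self.sent = [[] for _ in range(self.n_txq)]

    def send(self, qid, pkt):
        real = qid % self.n_txq  # assumed mapping, not from the source
        if self.needs_lock:
            with self.locks[real]:
                self.sent[real].append(pkt)
        else:
            self.sent[real].append(pkt)
        return real


dev = TxQueues(requested=8, supported=2)
print(dev.needs_lock, dev.send(5, "pkt"))
```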
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Right now ethernet and ring devices use a mutex, while vhost devices use
a mutex or a spinlock to protect statistics. This commit introduces a
single spinlock that's always used for stats updates.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
We used to reserve DPDK lcore 0 for non pmd operations, making it
difficult to use core 0 for packet processing.
DPDK 2.0 properly supports non EAL threads with lcore LCORE_ID_ANY.
Using non EAL threads for non pmd threads, we do not need to reserve
any core for non pmd operations.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Jarno Rajahalme [Fri, 22 May 2015 18:22:40 +0000 (11:22 -0700)]
datapath: Support masked set actions.
OVS kernel module support for masked set actions is already upstream
in Linux (commit 83d2b9ba1abca241df44a502b6da950a25856b5b). This
patch adds the same for the OVS tree kernel module.
The existing set action sets many fields at once. When only a subset
of the IP header fields, for example, should be modified, all the IP
fields need to be exact matched so that the other field values can be
copied to the set action. A masked set action allows modification of
an arbitrary subset of the supported header bits without requiring the
rest to be matched.
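
The idea of a masked set can be shown with plain integers. This is a generic sketch of the bit operation, not the kernel module's code; `masked_set` is a hypothetical helper name.

```python
def masked_set(old, value, mask):
    """Replace only the bits of 'old' selected by 'mask' with those of 'value'."""
    return (old & ~mask) | (value & mask)


# Rewrite only the low byte of an IPv4 address; the other 24 bits are
# carried over from the packet rather than from an exact match.
old_ip = 0x0A000001  # 10.0.0.1
new_ip = masked_set(old_ip, 0x000000FE, 0x000000FF)
print(hex(new_ip))  # 0xa0000fe
```

With an all-ones mask this reduces to an ordinary set action.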
Masked set action is now supported for all writeable key types, except
for the tunnel key. The set tunnel action is an exception as any
input tunnel info is cleared before action processing starts, so there
is no tunnel info to mask.
The kernel module converts all (non-tunnel) set actions to masked set
actions. This makes action processing more uniform, and results in
less branching and duplicating the action processing code. When
returning actions to userspace, the conversion is inverted. We use a
kernel internal action code to be able to tell the userspace provided
and converted masked set actions apart.
Having the same RSS hash after recirculation can cause unnecessary
collisions in the exact match cache. A simple solution is to rehash it
with the recirculation depth if it is non-zero.
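
A rough sketch of the rehash step. CRC32 is used here only as a stand-in mixing function; the hash function OVS actually uses is not stated in the text, and `emc_hash` is a hypothetical name.

```python
import struct
import zlib


def emc_hash(rss_hash, recirc_depth):
    """Return the hash to use for the exact match cache lookup."""
    if recirc_depth == 0:
        # First pass: use the RSS hash as provided.
        return rss_hash
    # After recirculation: mix the depth in so each pass gets its own hash.
    data = struct.pack("<II", rss_hash, recirc_depth)
    return zlib.crc32(data) & 0xFFFFFFFF


h = 0xDEADBEEF
print(emc_hash(h, 0) == h, emc_hash(h, 1) != emc_hash(h, 2))
```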
Suggested-by: Ethan Jackson <ethan@nicira.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Ethan Jackson [Wed, 20 May 2015 23:55:17 +0000 (16:55 -0700)]
dpif-netdev: Clear flow batches before execute.
When executing actions, it's possible a recirculation will occur
causing dp_netdev_input() to be called multiple times. If the batch
pointers embedded in dp_netdev_flow aren't cleared, it's possible
packets after the recirculation will be reinserted into a batch
associated with the original lookup. This could be very bad.
This patch fixes the problem by zeroing out flow batch pointers before
calling packet_batch_execute(). This probably has a slightly negative
performance impact, though I haven't tried it.
Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Kevin Traynor [Thu, 21 May 2015 16:26:48 +0000 (17:26 +0100)]
netdev-dpdk: Use default NIC configuration.
This patch simplifies Rx/Tx NIC configuration by removing
custom values and using the defaults provided by the DPDK
PMDs. This also enables Rx vectorisation which improves
performance.
Signed-off-by: Kevin Traynor <kevin.traynor@intel.com> Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Ciara Loftus [Wed, 13 May 2015 13:54:56 +0000 (14:54 +0100)]
dpif-netdev: Increase the number of EMC entries
Prior to this commit, the number of possible entries in the Exact
Match Cache stood at 1024 per thread equating to 0.18Mb. A typical
server system will have 2.5Mb cache per core meaning a larger EMC will
comfortably fit in. This patch increases the number of entries to 8192
per thread (1.4Mb) which in turn yields improved throughput when
processing multiple flows of traffic.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Ethan Jackson [Sat, 16 May 2015 15:18:20 +0000 (08:18 -0700)]
dpdk: Ditch MAX_PKT_BURST macro.
The MAX_PKT_BURST and NETDEV_MAX_RX_BATCH macros had a confusing
relationship. They basically purport to do the same thing, making it
unclear which is the source of truth.
Furthermore, while NETDEV_MAX_RX_BATCH was 256, MAX_PKT_BURST was 32,
meaning we never process a batch larger than 32 packets further adding
to the confusion.
This patch resolves the issue by removing MAX_PKT_BURST completely,
and shrinking the new NETDEV_MAX_BURST macro to only 32. This should
have no change in the execution path except shrinking a couple of
structs and memory allocations (can't hurt).
Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Andy Zhou [Tue, 19 May 2015 01:10:29 +0000 (18:10 -0700)]
ovs-ofctl: Always prints recirc_id in decimal
The 'ovs-ofctl dump-flows' command prints recirc_id in decimal
in the action parts of its output, but prints it in hex in the matching
parts of the same output.
This patch fixes the inconsistency by always printing recirc_id
values in decimal.
Reported-by: Justin Pettit <jpettit@nicira.com> Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
dpif-netdev: Share emc and fast path output batches.
Until now the exact match cache processing was able to handle only four
megaflows. The rest of the packets were passed to the megaflow
classifier.
The limit was arbitrarily set to four also because the algorithm used to
group packets in output batches didn't perform well with a lot of
megaflows.
After changing the algorithm and after some performance testing it seems
much better just to share the same output batches between the exact
match cache and the megaflow classifier.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
dpif-netdev: Store batch pointer in dp_netdev_flow.
The userspace datapath
1. receives a batch of packets.
2. finds a 'netdev_flow' (megaflow) for each packet.
3. groups the packets in output batches based on the 'netdev_flow'.
Until now the grouping (3) was done using a simple algorithm with an
O(N^2) runtime, where N is the number of distinct megaflows of the packets
in the incoming batch. This could quickly become a bottleneck, even with
a small number of megaflows.
With this commit the datapath simply stores in the 'netdev_flow' (the
megaflow) a pointer to the output batch, if one has been created for the
current input batch. The pointer will be cleared when the output batch
is sent.
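
The approach can be sketched in a few lines. The `Flow` class and helper functions below are illustrative stand-ins, not the dp_netdev_flow structure itself.

```python
class Flow:
    """Stand-in for a megaflow; 'batch' plays the role of the stored pointer."""

    def __init__(self, name):
        self.name = name
        self.batch = None


def group_packets(packets):
    """packets: list of (packet, flow) pairs. One pass, no search per packet."""
    batches = []
    for pkt, flow in packets:
        if flow.batch is None:
            # First packet of this flow in the current input batch.
            flow.batch = []
            batches.append((flow, flow.batch))
        flow.batch.append(pkt)
    return batches


def send_batches(batches):
    out = []
    for flow, batch in batches:
        out.append((flow.name, list(batch)))
        flow.batch = None  # cleared when the output batch is sent
    return out


a, b = Flow("a"), Flow("b")
print(send_batches(group_packets([(1, a), (2, b), (3, a)])))
```

Each packet finds its output batch through the flow it already matched, so the cost no longer depends on how many distinct megaflows appear in the input batch.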
In a simple phy2phy test with 128 megaflows the throughput is more than
doubled.
The reason that stopped us from doing this change was that the
'netdev_flow' memory was shared between multiple threads: this is no
longer the case with the per-thread classifier.
Also, this commit reorders struct dp_netdev_flow to group together the
members used in the fastpath.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
dpif-netdev: Store pkt_metadata structure in dp_netdev_port.
Initializing a struct pkt_metadata for every packet can be surprisingly
expensive. It's much faster to keep a copy for each port and copy it
for each packet.
Suggested-by: Pravin Shelar <pshelar@nicira.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
In 'struct ofpbuf' the 'frame' pointer was used to parse different kinds of
data (Ethernet, OpenFlow, Netlink attributes). For Ethernet packets the
'frame' pointer was supposed to have the same value as the 'data'
pointer.
Since 'struct dp_packet' is only used for Ethernet packets, there's no
need for a separate 'frame' pointer: we can use the 'data' pointer
instead.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Jarno Rajahalme [Mon, 18 May 2015 17:24:02 +0000 (10:24 -0700)]
ofproto: Fix memory leak in flow deletion.
Fix a memory leak that was introduced in commit 834fe5cb997b (ofproto:
Additional simplifications.). We used to unref the flow
asynchronously, but forgot to do it when the support for asynchronous
operations was removed.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Pravin B Shelar [Fri, 15 May 2015 13:32:32 +0000 (06:32 -0700)]
datapath: Fix Sparse warning.
CHECK /home/pravin/ovs/w8/datapath/linux/flow_table.c
/home/pravin/ovs/w8/datapath/linux/flow_table.c:536:6: warning: symbol
'ovs_flow_cmp_unmasked_key' was not declared. Should it be static?
Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
Oleg Strikov [Fri, 8 May 2015 19:05:13 +0000 (12:05 -0700)]
INSTALL.DPDK: Notes on running ovs-vswitchd/dpdk inside a VM
Additional configuration is required if you want to run ovs-vswitchd
with DPDK backend inside a QEMU virtual machine. This happens because,
by default, virtio NIC provided to the guest doesn't support multiple
TX queues which are required by ovs-vswitchd/dpdk. This commit updates
INSTALL.DPDK.md to provide guidelines on how to enable support for
multiple TX queues using QEMU command line and Libvirt config file.
Signed-off-by: Oleg Strikov <oleg.strikov@canonical.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Kevin Traynor [Tue, 12 May 2015 04:58:14 +0000 (21:58 -0700)]
netdev-dpdk: Add vhost enqueue retries.
The max allowed burst size for a single vhost enqueue is 32.
This code facilitates trying to send greater than the burst
size of packets to the vhost interface by adding a retry loop
and calling vhost enqueue multiple times. As this could
potentially block, a timeout is added.
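
A sketch of such a retry loop, under stated assumptions: `enqueue` is a hypothetical callable that accepts a list of packets and returns how many it took, and the timeout value is arbitrary. Only the burst limit of 32 comes from the text.

```python
import time

MAX_BURST = 32  # max packets per single enqueue call, per the text


def send_with_retries(enqueue, packets, timeout_s=0.001):
    """Send all packets in bursts; give up if no progress for timeout_s."""
    sent = 0
    deadline = None
    while sent < len(packets):
        n = enqueue(packets[sent:sent + MAX_BURST])
        if n > 0:
            sent += n
            deadline = None  # progress was made, restart the timer
        else:
            now = time.monotonic()
            if deadline is None:
                deadline = now + timeout_s
            elif now >= deadline:
                break  # receiver is not draining; drop the rest
    return sent


print(send_with_retries(lambda burst: len(burst), list(range(100))))
```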
Signed-off-by: Kevin Traynor <kevin.traynor@intel.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Kevin Traynor [Tue, 12 May 2015 04:58:12 +0000 (21:58 -0700)]
netdev-dpdk: Change phy rx burst size.
Change phy rx burst size from 192 to 32. This aligns the
burst size with the other dpdk interfaces and significantly
improves performance when forwarding to dpdk vhost ports.
Signed-off-by: Kevin Traynor <kevin.traynor@intel.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Ethan Jackson [Thu, 26 Mar 2015 19:52:42 +0000 (12:52 -0700)]
utilities: Add new pipeline generator script.
When doing OVS performance testing, it's important to have both
realistic traffic traces and OpenFlow pipelines on which to evaluate
prospective changes. As a first step in this direction, this patch
adds a python script which generates an OpenFlow pipeline intended to
simulate typical network virtualization workloads.
Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Ben Pfaff [Fri, 8 May 2015 16:15:43 +0000 (09:15 -0700)]
ofp-util: Use OFPGMFC_OUT_OF_BUCKETS for indirect groups with !=1 buckets.
OpenFlow 1.3 says:
If a switch cannot add the incoming group entry due to restrictions
(hardware or otherwise) limiting the number of group buckets, it must
refuse to add the group entry and must send an ofp_error_msg with
OFPET_GROUP_MOD_FAILED type and OFPGMFC_OUT_OF_BUCKETS code.
This indicates that OFPGMFC_OUT_OF_BUCKETS is appropriate for an indirect
group with the wrong number of buckets, but OVS was using a different
error. This fixes the problem.
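
The rule can be expressed as a small check. The function name and the string group type are hypothetical; the error type and code names are the ones quoted from the specification above.

```python
def check_group_mod(group_type, n_buckets):
    """Return an (error type, error code) pair, or None if the group is OK."""
    # An indirect group must have exactly one bucket.
    if group_type == "indirect" and n_buckets != 1:
        return ("OFPET_GROUP_MOD_FAILED", "OFPGMFC_OUT_OF_BUCKETS")
    return None


print(check_group_mod("indirect", 2))
print(check_group_mod("indirect", 1))
```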
ONF-JIRA: EXT-546 Reported-by: Mrinmoy Das <mrdas@ixiacom.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Justin Pettit <jpettit@nicira.com>
Pravin B Shelar [Thu, 7 May 2015 17:17:26 +0000 (10:17 -0700)]
datapath: define compat __skb_gso_segment()
OVS defines skb_gso_segment() to handle MPLS and VLAN
segmentation correctly. But OVS also uses __skb_gso_segment() in
some cases. This patch defines a compat __skb_gso_segment()
to handle all segmentation cases.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
dpctl: Ignore enumeration errors if there is at least one datapath.
When dpctl commands are used to inspect a userspace datapath, but OVS
also has built-in support for the kernel datapath, an error message is
reported if the kernel module is not loaded. This commit suppresses the
message.
Suggested-by: Ethan Jackson <ethan@nicira.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Joe Stringer [Wed, 6 May 2015 21:31:55 +0000 (14:31 -0700)]
lldp: Fix clang warning.
Clang-3.7 generates warnings such as the following:
../lib/ovs-lldp.c:394:19: error: address of array 'hardware->h_ifname'
will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
This value is fetched from a netdev, which as far as I can tell must
always have a non-NULL name. Simplify this code.
Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Dennis Flynn <drflynn@avaya.com> Acked-by: Ben Pfaff <blp@nicira.com>
Billy O'Mahony [Tue, 5 May 2015 16:37:31 +0000 (17:37 +0100)]
docs: Clarify creation & bonding of DPDK enabled interfaces.
Unlike system interfaces, DPDK enabled interfaces must have their interface
type explicitly set when used to create ports. Mention this in relevant parts
of the documentation and add references to INSTALL.DPDK.md, where there are many
examples.
Signed-off-by: Billy O'Mahony <billy.o.mahony@intel.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
xenserver: Use kernel uname version for XenServer 6.5
In XenServer 6.5, multiple kernel packages with different
rpm versions can have the same uname. So, it is not
necessary for openvswitch kernel module to require the
exact rpm version. Instead, the kernel module package
should check the uname version.
This commit will add a new variable %{kernel_uname} to
specify whether to use kernel uname version or kernel
rpm version as requirement.
When %{kernel_uname} is used, openvswitch-module will have
"Requires: kernel-uname-r = <uname version>" set instead of
"Requires: kernel = <version>".
Reported-by: Gosen Chien <astgosen@ccu.edu.tw> Signed-off-by: Edwin Chiu <echiu@vmware.com> Signed-off-by: Alex Wang <alexw@nicira.com>
Pravin B Shelar [Sat, 2 May 2015 00:30:44 +0000 (17:30 -0700)]
datapath: gre: Reset fix_segment pointer.
For kernel versions 3.12 to 3.18, GRE uses compat code to
transmit packets, which uses fix_segment to segment packets,
but ovs_gso_cb->fix_segment is not initialized for GRE tunnels.
This patch fixes it by resetting fix_segment.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
Pravin B Shelar [Fri, 1 May 2015 18:02:02 +0000 (11:02 -0700)]
dpctl: cleaner dpctl output for tunnel ports.
Currently dont-fragment and TTL are initialized to zero, but
those are not default config for tunnel ports. dpctl
does not show default config of a port. So by setting these
values to default we can get cleaner `dpctl show` output.
% ovs-dpctl show
system@ovs-system:
port 0: ovs-system (internal)
port 1: br0 (internal)
port 4: gre_sys (gre: df_default=false, ttl=0)
% ovs-dpctl show # After initializing default values.
system@ovs-system:
port 0: ovs-system (internal)
port 1: br0 (internal)
port 4: gre_sys (gre)
Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
Mark Kavanagh [Mon, 20 Apr 2015 19:37:14 +0000 (12:37 -0700)]
DPDK: add support for v2.0.0
Update relevant artifacts to add support for DPDK v2.0.0
- INSTALL.DPDK.md
- travis build script
- acinclude.m4: add 'mssse3' flag to OVS_CFLAGS
- netdev-dpdk: fix build with unified offload types in DPDK v2.0.0
Note that this breaks compatibility with DPDK v1.8.0
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Signed-off-by: Panu Matilainen <pmatilai@redhat.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
The Stateless TCP Tunnel (STT) protocol encapsulates traffic in
IPv4/TCP packets.
STT uses TCP segmentation offload available in most NICs. On
packet xmit the STT driver appends the STT header along with a TCP
header to the packet. For a GSO packet, GSO parameters are set
according to tunnel configuration and the packet is handed over to the
networking stack. This allows use of segmentation offload available in NICs.
The protocol is documented at
http://www.ietf.org/archive/id/draft-davie-stt-06.txt
ovs-hyperv: make kernel return values netlink socket like
In this patch, we make changes to userspace as well as
kernel datapath on hyperv to make it more netlink socket
like. Previously, the kernel datapath did not distinguish
between "transport errors" and other errors. Netlink
semantics dictate that netlink functions should return
an error only in the case of a "transport error",
which is generally something fatal, e.g. failure to
communicate with the OVS module, or an invalid command
altogether. Other errors, such as an unsupported action
or an invalid flow key, are not considered "transport
errors", and in such cases, netlink functions are to return
success with a 'struct nlmsgerr' populated in the output
buffer.
The extension failed to be activated during booting due to the
failure to initialize tunnel filter. This happened because the Base
Filtering Engine (BFE) is not started and no session to the engine
could be acquired.
The solution for this was to register a BFE notification callback
that is called whenever the BFE's state changes. The tunnel filter
is initialized only when the BFE's state is running.
datapath: Fix check-export-symbol for non-bash shells
Avoid using a bash construct (=~) in the target.
An alternative would be to make the configure script require
bash explicitly. (Currently it doesn't and on NetBSD /bin/ksh
is likely used.)
The code in question was introduced by
commit b296b82a87326e68773b970284b8e012def0e3ba .
("datapath: Check the export of public functions in linux/compat/linux/.")
Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Acked-by: Alex Wang <alexw@nicira.com>
datapath: Stop using __DATE__ and __TIME__ in startup string.
An increasing number of distributions ship with GCC 4.9 (including
Fedora and Ubuntu) that has -Werror=date-time. This causes kernel
compilation to fail because the builds are not exactly reproducible.
This simply removes the use of those constants, which was already
done for the upstream Linux version of the module. It retains the
version string, however, which should provide the same information
in most cases.
Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Terry Wilson [Sat, 25 Apr 2015 19:57:44 +0000 (14:57 -0500)]
Allow subclasses of Idl to define a notification hook
It is useful to make the notification events that Idl processes
accessible to users of the library. This will make it possible to
keep external systems in sync, but does not impose any particular
notification pattern.
The Row.from_json() call is added to be able to convert the 'old'
JSON response on an update to a Row object to make it easy for
users of notify() to see what changed, though this usage of Row
is quite different than Idl's typical use.
Signed-off-by: Terry Wilson <twilson@redhat.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Reported-by: Kevin Lo <kevlo@FreeBSD.org> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Simon Horman <simon.horman@netronome.com> Acked-by: Kevin Lo <kevlo@FreeBSD.org>
datapath: Use kernel Geneve implementation on 4.0 and above.
When Geneve was originally backported, it wasn't available as part
of a released kernel version but it is now, so we can take advantage
of the native implementation.
Note that Geneve was actually first available as part of the 3.18
kernel release but some drivers erroneously try to offload it as
if it were VXLAN, which was fixed in the 4.0 release. Since our
UDP tunnel compat layer already takes care of this, we continue
using the OVS Geneve implementation until 4.0.
Reported-by: Alex Wang <alexw@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Reviewed-by: Simon Horman <simon.horman@netronome.com>
When the UDP tunnel compat code was written, it backported some
functions that were slated to be in the next kernel release, then
called 3.20. However, this was ultimately released as 4.0 instead.
Signed-off-by: Jesse Gross <jesse@nicira.com> Reviewed-by: Simon Horman <simon.horman@netronome.com>
Alex Wang [Mon, 20 Apr 2015 03:54:50 +0000 (20:54 -0700)]
datapath: Check the export of public functions in linux/compat/linux/.
This commit adds a check in datapath/Makefile to make sure that all public
functions and exported symbols in linux/compat/ are either rpl_ or ovs_
prefixed, except those defined in compat/build-aux/export-check-whitelist.
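
The rule being enforced can be sketched as a simple filter. This is not the Makefile target itself; the function name and the sample symbol names are chosen for illustration.

```python
def unprefixed_symbols(symbols, whitelist=()):
    """Return the symbols that break the rpl_/ovs_ prefix rule."""
    allowed = set(whitelist)
    return [s for s in symbols
            if not s.startswith(("rpl_", "ovs_")) and s not in allowed]


# Hypothetical symbol list; only the middle entry violates the rule.
print(unprefixed_symbols(
    ["rpl_example_fn", "example_fn", "ovs_example_fn", "legacy_fn"],
    whitelist=["legacy_fn"]))
```

An empty result means the check passes.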
Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
Alex Wang [Tue, 21 Apr 2015 01:19:53 +0000 (18:19 -0700)]
datapath: Prevent linker error of unknown symbol.
With the latest change of separating vports into their own modules,
it is necessary to export all public functions in linux/compat/
directory. Also, we should prefix functions which replace the
upstream ones with 'rpl_' and others with 'ovs_'. This will prevent
the linker error when vport modules use those functions in the future.
e.g., the to be merged vport-stt module will use the flex_array_*
functions which are not currently exported.
Co-authored-by: Tuan Nguyen <tuan.nguyen@veriksystems.com> Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
When linking executables on Windows, the argument -Qunused-arguments
is passed to the linker.
This results in the following warning:
Command line warning D9002 : ignoring unknown option '-Qunused-arguments'
This patch removes that warning.
Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
datapath-windows: don't free switch cxt until ref == 0
This is a hard to hit corner case, because currently we recommend that
all handles to the kernel datapath be closed before trying to unload the
OVS extension.
Mark D. Gray [Mon, 13 Apr 2015 13:36:56 +0000 (06:36 -0700)]
netdev-dpdk: Reset RSS hash on transmit
When using DPDK rings (dpdkr port type), packet buffers get shared
to consumers of the rings (e.g. Virtual Machines). The packet buffers
also include the RSS hash. This is a hash of a number of fields
in the packet and is used in order to do a fast lookup in the EMC.
However, if a consumer of the packet modifies the packet without
regenerating the RSS hash, the EMC will use the same hash for lookup
even though the packet may belong to a different flow. This would
cause unnecessary collisions in the EMC reducing performance in the
presence of multiple flows.
To avoid receiving an incorrect RSS hash on reception from a DPDK
ring, the RSS hash needs to be reset on transmission. This will reduce
performance of the forwarding path as the RSS hash will need to be
calculated for every packet received from a dpdkr port, but the datapath
will behave correctly in the presence of a large number of flows that get
modified by the consumer of a DPDK ring.
Signed-off-by: Mark D. Gray <mark.d.gray@intel.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
It has been observed that sometimes Windows unit tests hang.
This happens when a daemon is started but does not get terminated
when the test ends.
In one particular case, OVS_VSWITCHD_STOP is called, which in turn
calls 'ovs-appctl exit'. This causes ovs-vswitchd's atexit handler
to clean up the pidfiles. After this, the pthread destructors get
called and a deadlock happens there. As a result, the daemons are
not force killed and the tests hang, because the cleanup file tries
to run the command
"kill `cat ovs-vswitchd.pid`" and ovs-vswitchd.pid no longer exists.
With this commit, we write the pid value of the daemons in the
cleanup file (instead of asking it to 'cat' the value later from
the pidfile). This way, even if the pidfiles get deleted, we can
still kill the daemons.
This commit also changes the way daemons are force killed in
Windows. It was observed that 'taskkill //F ' failed to kill
a deadlocked daemon running its pthread destructor. But
tskill succeeds.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
(ON_EXIT_UNQUOTED macro provided by Ben.) Co-authored-by: Ben Pfaff <blp@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Thomas Graf [Wed, 22 Apr 2015 07:49:43 +0000 (09:49 +0200)]
ovs-ctl: Unload & reload vport modules on force-reload-kmod
We manually rmmod the loaded vports as using modprobe -r
only works if the modules are available through modules.dep.
We do not treat failures to load vports as a fatal error in case
the vport module has been renamed. Bringing the bridge back up is
considered more important. The error is still reported though.
Ben Pfaff [Thu, 16 Apr 2015 21:38:36 +0000 (14:38 -0700)]
netdev-dummy: Fix null pointer deref in dummy_packet_conn_set_config().
This would trigger if someone tried to switch a dummy device between
active and passive connections. It's not very important because dummy
devices are only enabled during testing.
Found by LLVM scan-build.
Reported-by: Kevin Lo <kevlo@FreeBSD.org> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
Ben Pfaff [Mon, 20 Apr 2015 19:11:23 +0000 (12:11 -0700)]
dpctl.at: Ignore string representation of error messages in output.
Different C libraries represent the same error code (particularly ENODEV)
differently. This caused spurious test failures on BSD. This commit
avoids the problem by ignoring the error string representations entirely.
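
One possible way to normalize such output before comparing it is sketched below. The message layout, with the error string in trailing parentheses, is an assumption made for this example and may not match what the test suite actually sees.

```python
import re


def strip_error_string(line):
    """Drop a trailing ' (...)' group, where the strerror() text would be."""
    return re.sub(r" \([^()]*\)$", "", line)


# Two C libraries may describe the same error code differently
# (hypothetical messages).
linux = "opening datapath (No such device)"
bsd = "opening datapath (Device not configured)"
print(strip_error_string(linux) == strip_error_string(bsd))
```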
Alex Wang [Mon, 20 Apr 2015 22:01:40 +0000 (15:01 -0700)]
in-band: Do not use manager with loopback address for in-band control.
If the manager resides on the same host as ovs, the manager target will
be the loopback address. Then, if in-band is enabled on a bridge, the
in-band module will constantly check the connection to the manager to
make sure the manager is reachable. However, the connection checking
implementation cannot identify the route for the loopback address and
will keep issuing the following warning:
|in_band|WARN|cannot find route for controller (127.0.0.1): No such
device or address.
To fix this, this commit makes ovs not consider a manager with a
loopback address for in-band control at all, since the manager is
always reachable on the same host.
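
The filtering step can be sketched with Python's standard ipaddress module. The function name `in_band_remotes` and the sample addresses are illustrative; this is not the in-band module's code.

```python
import ipaddress


def in_band_remotes(manager_ips):
    """Keep only the managers that in-band control needs to track."""
    return [ip for ip in manager_ips
            if not ipaddress.ip_address(ip).is_loopback]


# The loopback managers are dropped; the remote one is kept.
print(in_band_remotes(["127.0.0.1", "192.0.2.10", "::1"]))
```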
Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
datapath-windows: Removed assert from FilterNetPnPEvent handler
I have removed an inappropriate assert from the FilterNetPnPEvent
routine, OvsExtNetPnPEvent. When NDIS calls the FilterNetPnPEvent
routine, the extension is in paused state and, obviously, the
switch is not active. The switch becomes active after FilterRestart
routine is called and the restart is successfully complete.