This patch is similar to the earlier vxlan patch.
The geneve device close operation frees the geneve socket. This
operation can race with the geneve-xmit function, which
dereferences the geneve socket. This patch uses the RCU
mechanism to avoid this situation.
Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
ifnotifier: do not wake up when there is no db connection
When the bridge uses the interface notifier, it keeps waking up until a
reconfiguration takes place. However, if there is no connection to the
database, or there is lock contention on it, the check for reconfiguration
will not take place.
This uses a seq and only calls seq_wait when checking for interface changes.
This is easily reproduced by starting ovs-vswitchd without starting
ovsdb-server, and then creating a new system interface, like using
'ip link add type veth'. ovs-vswitchd will then consume 100% CPU.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Shashank Ram [Mon, 10 Oct 2016 22:15:05 +0000 (15:15 -0700)]
datapath-windows: Set isActivated flag only on success
@Switch.c: Modifies OvsActivateSwitch() function
to mark the switch as activated only if the
status is success. The callers themselves
only call this method when the isActivated
flag is unset.
Mauricio Vasquez [Fri, 21 Oct 2016 04:51:24 +0000 (23:51 -0500)]
doc: v2: fix bad link to dpdk advance installation guide
Previous fix was also wrong.
Fixes: 167703d ("doc: Convert INSTALL.DPDK to rST") Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it> Acked-by: Stephen Finucane <stephen@that.guru> Signed-off-by: Russell Bryant <russell@ovn.org>
Jarno Rajahalme [Thu, 20 Oct 2016 22:22:14 +0000 (15:22 -0700)]
datapath: Support a fixed size of 128 distinct labels.
Port upstream change in conntrack labels extension. Add a new
configure macro HAVE_NF_CONN_LABELS_WITH_WORDS to detect the old
definition. Unfortunately there is no conntrack API to hide the
difference, so this makes conntrack.c deviate from upstream source
a bit.
netfilter: conntrack: support a fixed size of 128 distinct labels
The conntrack label extension is currently variable-sized, e.g. if
only 2 labels are used by iptables rules then the labels->bits[] array
will only contain one element.
We track size of each label storage area in the 'words' member.
But in nftables and openvswitch we always have to ask for worst-case
since we don't know what bit will be used at configuration time.
As most arches are 64bit we need to allocate 24 bytes in this case:
Make bits a fixed size and drop the words member. This simplifies
the code and only increases memory requirements on x86 when
fewer than 64 label bits are required.
We still only allocate the extension if it's needed.
Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Pravin B Shelar <pshelar@ovn.org>
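The fixed-size label storage described in the commit above can be illustrated with a small sketch. This is not the kernel's definition: the names LABEL_BITS, BITS_PER_ULONG, struct labels_fixed, label_set() and label_test() are hypothetical and chosen only for illustration. The sketch assumes 8-bit bytes, so 128 label bits occupy 16 bytes regardless of the width of unsigned long.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical names: a fixed-size store for 128 distinct labels. */
#define LABEL_BITS 128
#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* No 'words' member: the array length is a compile-time constant. */
struct labels_fixed {
    unsigned long bits[LABEL_BITS / BITS_PER_ULONG];
};

/* Sets label 'bit' (0..LABEL_BITS-1) in 'l'. */
static inline void label_set(struct labels_fixed *l, unsigned int bit)
{
    l->bits[bit / BITS_PER_ULONG] |= 1UL << (bit % BITS_PER_ULONG);
}

/* Returns 1 if label 'bit' is set in 'l', otherwise 0. */
static inline int label_test(const struct labels_fixed *l, unsigned int bit)
{
    return (int) ((l->bits[bit / BITS_PER_ULONG] >> (bit % BITS_PER_ULONG)) & 1UL);
}
```

Because the size is fixed, a caller never has to ask for a worst-case allocation or track how many words are in use.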
Flavio Leitner [Tue, 18 Oct 2016 17:04:42 +0000 (15:04 -0200)]
fedora: do not restart the service on a pkg upgrade
There is no reliable way to restore the previous networking
state after a service restart. Many things like firewall
configuration, traffic shaping, stacked devices, custom setups
are completely out of OVS control.
The OVS might be providing the network used for remote
administration, so do not automatically restart the service
during a package upgrade.
Signed-off-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: Russell Bryant <russell@ovn.org>
Ben Pfaff [Wed, 7 Sep 2016 16:04:46 +0000 (09:04 -0700)]
ovsdb-idl: Check internal graph in OVSDB tests.
Some upcoming tests will add extra trickiness to the IDL internal graph.
This worries me, because the IDL doesn't have any checks for its graph
consistency. This commit adds some.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
Ben Pfaff [Tue, 30 Aug 2016 23:58:44 +0000 (16:58 -0700)]
ovsdb-idlc: Remove special case for "sizeof bool".
The "sparse" checker used to warn about sizeof(bool). These days, it does
not warn (without -Wsizeof-bool), so remove this ugly special case.
If you have a version of "sparse" that still warns by default, please
upgrade to a version that includes commit 2667c2d4ab33 (sparse: Allow
override of sizeof(bool) warning).
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
Ben Pfaff [Wed, 31 Aug 2016 18:42:53 +0000 (11:42 -0700)]
ovsdb-idl: Sort and unique-ify datum in ovsdb_idl_txn_write().
I noticed that there were lots of calls to ovsdb_datum_sort_unique() from
"set" functions in generated IDL code. This moves that call into common
code, reducing redundancy.
There are more calls to the same function that are a little harder to
remove.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
Ben Pfaff [Wed, 24 Aug 2016 19:38:39 +0000 (12:38 -0700)]
ovsdb-idlc: Declare loop variables in for statements in generated code.
This changes several instances of
size_t i;
for (i = 0; i < ...; i++)
into:
for (size_t i = 0; i < ...; i++)
in generated code, making it slightly more compact and easier to read.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
Ben Pfaff [Wed, 24 Aug 2016 19:32:59 +0000 (12:32 -0700)]
ovsdb-idlc: Make generated references to columns easier to read.
This replaces ovsrec_open_vswitch_columns[OVSREC_OPEN_VSWITCH_COL_CUR_CFG]
by the easier to read and equivalent ovsrec_open_vswitch_col_cur_cfg in
generated code.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
Ben Pfaff [Wed, 24 Aug 2016 18:47:56 +0000 (11:47 -0700)]
ovsdb-idlc: Simplify code generation to parse sets and maps of references.
This switches from code that looks like:
if (keyRow) {
...
}
to:
if (!keyRow) {
continue;
}
...
which is a little easier to generate because the indentation of ... is
constant.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
Ben Pfaff [Thu, 1 Sep 2016 05:03:27 +0000 (22:03 -0700)]
ovsdb-idlc: Use ovsdb_datum_from_smap() instead of open-coding it.
There's no reason to have three copies of this code for every smap-type
column.
The code wasn't a perfect match for ovsdb_datum_from_smap(), so this commit
also changes ovsdb_datum_from_smap() to better suit it. It only had one
caller and the new design is adequate for that caller.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
Mauricio Vasquez [Tue, 18 Oct 2016 22:23:12 +0000 (17:23 -0500)]
doc: fix bad link to dpdk advance installation guide
The link was pointing to a wrong place after the file was converted to rst.
Fixes: 167703d664fc ("doc: Convert INSTALL.DPDK to rST") Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it> Acked-by: Stephen Finucane <stephen@that.guru> Signed-off-by: Russell Bryant <russell@ovn.org>
YAMAMOTO Takashi [Tue, 18 Oct 2016 11:31:55 +0000 (11:31 +0000)]
ovn-controller.at: Stop hardcoding a list of iface types
The list of supported iface types hardcoded in the test
is wrong on NetBSD (or on any userland-only port, I guess).
Instead of adding another case for NetBSD following WIN32,
just get the list from ovsdb.
Signed-off-by: YAMAMOTO Takashi <yamamoto@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
This will eventually go away once Sphinx starts doing all this work for
us. For now, however, let's make sure we don't break the OVS website.
This introduces a new dependency for the dist-docs script - 'rst2html'.
This tool is packaged on Ubuntu, Fedora (via 'python-docutils'), etc.
and can be installed from pip using the 'docutils' package.
Signed-off-by: Stephen Finucane <stephen@that.guru> Signed-off-by: Russell Bryant <russell@ovn.org>
By reordering the data elements in the dpif_upcall structure, the pad bytes
can be reduced and one cache line saved. Also, dp_packet should be the first
member of the structure because rte_mbuf, the first member of dp_packet,
should be aligned on at least a 64-byte boundary.
Before: structure size:768, holes:1, sum padbytes:60, cachelines:12
After: structure size:704, holes:1, sum padbytes:4, cachelines:11
Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com> Co-authored-by: Antonio Fischetti <antonio.fischetti@intel.com> Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com> Acked-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
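The effect of member ordering on padding, as described in the commit above, can be sketched with two toy structures. These are hypothetical and are not the dpif_upcall layout; the byte counts in the comments assume a typical 64-bit ABI with 8-byte pointers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout with members in a padding-heavy order. */
struct upcall_loose {
    uint8_t  flag_a;   /* 1 byte, then padding so 'ptr' is aligned */
    void    *ptr;
    uint8_t  flag_b;   /* 1 byte, then padding so 'len' is aligned */
    uint32_t len;
    uint8_t  flag_c;   /* 1 byte, then tail padding */
};

/* Same members, ordered from largest to smallest alignment, so the
 * small members share what used to be padding. */
struct upcall_packed {
    void    *ptr;
    uint32_t len;
    uint8_t  flag_a;
    uint8_t  flag_b;
    uint8_t  flag_c;
};
```

On such an ABI sizeof(struct upcall_loose) is 32 and sizeof(struct upcall_packed) is 16, although both hold the same 15 bytes of data. Placing the most strictly aligned member first also keeps it at offset 0, which matches the commit's reasoning for putting dp_packet first.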
hash: Skip Invoking mhash_add__() with zero input.
mhash_add__() is expensive and should only be called with valid input.
A zero-valued 'data' does not affect the 'hash' value, so the expensive hash
computation can be skipped when the input is zero.
This patch validates the input in mhash_add__() to save some CPU
cycles.
Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com> Co-authored-by: Antonio Fischetti <antonio.fischetti@intel.com> Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
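Why a zero input can be skipped, as the commit above argues, can be seen in a small sketch of a MurmurHash3-style mixing step. The function names rotl32(), mix_slow() and mix_skip_zero() are hypothetical, and the constants are the standard MurmurHash3 ones, assumed here rather than taken from the patch.

```c
#include <stdint.h>

/* Rotates 'x' left by 'k' bits, 0 < k < 32. */
static inline uint32_t rotl32(uint32_t x, int k)
{
    return (x << k) | (x >> (32 - k));
}

/* MurmurHash3-style mixing of one 32-bit word into 'hash'.
 * With data == 0 every multiply and rotate yields 0, so the
 * final XOR leaves 'hash' unchanged. */
static inline uint32_t mix_slow(uint32_t hash, uint32_t data)
{
    data *= 0xcc9e2d51;
    data = rotl32(data, 15);
    data *= 0x1b873593;
    return hash ^ data;
}

/* Same result as mix_slow(), but skips the arithmetic for zero input. */
static inline uint32_t mix_skip_zero(uint32_t hash, uint32_t data)
{
    if (!data) {
        return hash;
    }
    return mix_slow(hash, data);
}
```

Whether the extra branch pays off depends on how often zero words appear in the hashed data, so this is a trade-off rather than a guaranteed win.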
Ben Pfaff [Fri, 14 Oct 2016 18:12:43 +0000 (11:12 -0700)]
stream-ssl: Fix memory leak on error path.
The commit that this fixes is from 2009.
Reported-by: Kai-Wei Fan <fank@vmware.com> Fixes: 9467fe624698 ("Add SSL support to "stream" library and OVSDB.") Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
Amitabha Biswas [Wed, 12 Oct 2016 21:36:57 +0000 (14:36 -0700)]
Python-IDL: getattr after mutate fix
This commit returns the updated column value when getattr is done
after a mutate operation is performed (but before the commit).
Signed-off-by: Amitabha Biswas <azbiswas@gmail.com> Reported-by: Richard Theis <rtheis@us.ibm.com>
Reported-at: http://openvswitch.org/pipermail/dev/2016-September/080120.html Fixes: a59912a0ee8e ("python: Add support for partial map and set updates") Signed-off-by: Russell Bryant <russell@ovn.org>
Aaron Conole [Fri, 7 Oct 2016 17:36:45 +0000 (13:36 -0400)]
rhel-systemd: Delay shutting down the services
During testing it was found that systemd would consider the openvswitch
service as part of the networking component, but the dependent services
ovs-vswitchd and ovsdb-server were not likewise considered. This leads
to some strange race conditions, observed when using NFS over TCP, while
shutting down systems.
Mark Kavanagh [Thu, 6 Oct 2016 10:25:33 +0000 (11:25 +0100)]
doc: Update DPDK pdump documentation
The DPDK pdump sample app was renamed from 'dpdk_pdump' to
'dpdk-pdump'. Update references to same within
INSTALL.DPDK-ADVANCED.md.
Add an additional sample command line that shows how to capture all
traffic traversing an interface within a single pcap file - a useful
tool for debugging DPDK-related issues.
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Ciara Loftus [Thu, 13 Oct 2016 17:27:51 +0000 (18:27 +0100)]
dpdk: Fix DPDK pdump compilation
The rte_pdump header file was not included in the file that requires it.
Fix this.
Fixes: 01961bbdd34a ("dpdk: New module with some code from netdev-dpdk.") Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Since DPDK commit 30e639989227 ("mempool: support non-EAL thread"),
non-EAL threads can use the mempool API safely. In addition, non-pmd
threads' access to netdev is already serialized with 'non_pmd_mutex' in
dpif-netdev.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ben Pfaff <blp@ovn.org> Acked-by: Aaron Conole <aconole@redhat.com> Tested-by: Aaron Conole <aconole@redhat.com>
netdev-dpdk: Do not abort if out of hugepage memory.
We can run out of hugepage memory coming from rte_*alloc() more easily
than heap coming from malloc().
Therefore:
* We should not use hugepage memory if we're going to access it only in
the slowpath.
* We shouldn't abort if we're out of hugepage memory.
* We should gracefully handle out of memory conditions.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ben Pfaff <blp@ovn.org> Acked-by: Aaron Conole <aconole@redhat.com> Tested-by: Aaron Conole <aconole@redhat.com>
I think it's clearer to use RCU than to check for a pointer twice in the
fast path (before and after taking the spinlock). Now the spinlock is
integrated into 'qos_conf'.
'qos_conf' objects cannot be modified, so, instead of having
'qos_set()', we now have 'qos_is_equal()', which tells us if an object
must be destroyed and recreated.
With this patch we also avoid passing the netdev parameter to qos ops,
since it was unused most of the time.
Lastly, some duplication is removed.
CC: Ian Stokes <ian.stokes@intel.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ben Pfaff <blp@ovn.org> Acked-by: Aaron Conole <aconole@redhat.com> Tested-by: Aaron Conole <aconole@redhat.com>
Looks like we forgot to add the copyright headers to netdev-dpdk.h.
Looking at the contribution history of the file, this commit adds the
header with Red Hat copyright.
Looks like we forgot to add the copyright headers to netdev-dpdk.h.
Looking at the contribution history of the file, this commit adds the
header with Nicira copyright.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ben Pfaff <blp@ovn.org> Acked-by: Aaron Conole <aconole@redhat.com> Tested-by: Aaron Conole <aconole@redhat.com>
dp_netdev_get_pmd() is allowed to return NULL (even if we call it with
NON_PMD_CORE_ID) for different reasons:
* Since we use RCU to protect pmd threads, it is possible that
ovs_refcount_try_ref_rcu() has failed.
* During reconfiguration we destroy every thread.
This commit makes sure that we always handle the case when
dp_netdev_get_pmd() returns NULL without crashing (the change in
dpif_netdev_run() doesn't fix anything, because everything is happening
in the main thread, but it's better to honor the interface in case we
change our threading model).
This actually fixes a pretty serious crash that happens if
dpif_netdev_execute() is called from a non pmd thread while
reconfiguration is happening. It can be triggered by enabling bfd
(because it's handled by the monitor thread, which is a non pmd thread)
on an interface and changing something that requires datapath
reconfiguration (n_rxq, pmd-cpu-mask, mtu).
A testcase that reproduces the race condition is included.
This is a possible backtrace of the segfault:
#0 0x000000000060c7f1 in dp_execute_cb (aux_=0x7f1dd2d2a320,
packets_=0x7f1dd2d2a370, a=0x7f1dd2d2a658, may_steal=false) at
../lib/dpif-netdev.c:4357
#1 0x00000000006448b2 in odp_execute_actions (dp=0x7f1dd2d2a320,
batch=0x7f1dd2d2a370, steal=false, actions=0x7f1dd2d2a658,
actions_len=8,
dp_execute_action=0x60c7a5 <dp_execute_cb>) at
../lib/odp-execute.c:538
#2 0x000000000060d00c in dp_netdev_execute_actions (pmd=0x0,
packets=0x7f1dd2d2a370, may_steal=false, flow=0x7f1dd2d2ae70,
actions=0x7f1dd2d2a658, actions_len=8,
now=44965873) at ../lib/dpif-netdev.c:4577
#3 0x000000000060834a in dpif_netdev_execute (dpif=0x2b67b70,
execute=0x7f1dd2d2a578) at ../lib/dpif-netdev.c:2624
#4 0x0000000000608441 in dpif_netdev_operate (dpif=0x2b67b70,
ops=0x7f1dd2d2a5c8, n_ops=1) at ../lib/dpif-netdev.c:2654
#5 0x0000000000610a30 in dpif_operate (dpif=0x2b67b70,
ops=0x7f1dd2d2a5c8, n_ops=1) at ../lib/dpif.c:1268
#6 0x000000000061098c in dpif_execute (dpif=0x2b67b70,
execute=0x7f1dd2d2aa50) at ../lib/dpif.c:1233
#7 0x00000000005b9008 in ofproto_dpif_execute_actions__
(ofproto=0x2b69360, version=18446744073709551614, flow=0x7f1dd2d2ae70,
rule=0x0, ofpacts=0x7f1dd2d2b100,
ofpacts_len=16, indentation=0, depth=0, resubmits=0,
packet=0x7f1dd2d2b5c0) at ../ofproto/ofproto-dpif.c:3806
#8 0x00000000005b907a in ofproto_dpif_execute_actions
(ofproto=0x2b69360, version=18446744073709551614, flow=0x7f1dd2d2ae70,
rule=0x0, ofpacts=0x7f1dd2d2b100,
ofpacts_len=16, packet=0x7f1dd2d2b5c0) at
../ofproto/ofproto-dpif.c:3823
#9 0x00000000005dea9b in xlate_send_packet (ofport=0x2b98380,
oam=false, packet=0x7f1dd2d2b5c0) at
../ofproto/ofproto-dpif-xlate.c:5792
#10 0x00000000005bab12 in ofproto_dpif_send_packet (ofport=0x2b98380,
oam=false, packet=0x7f1dd2d2b5c0) at ../ofproto/ofproto-dpif.c:4628
#11 0x00000000005c3fc8 in monitor_mport_run (mport=0x2b8cd00,
packet=0x7f1dd2d2b5c0) at ../ofproto/ofproto-dpif-monitor.c:287
#12 0x00000000005c3d9b in monitor_run () at
../ofproto/ofproto-dpif-monitor.c:227
#13 0x00000000005c3cab in monitor_main (args=0x0) at
../ofproto/ofproto-dpif-monitor.c:189
#14 0x00000000006a183a in ovsthread_wrapper (aux_=0x2b8afd0) at
../lib/ovs-thread.c:342
#15 0x00007f1dd75eb444 in start_thread (arg=0x7f1dd2d2c700) at
pthread_create.c:333
#16 0x00007f1dd6e1d20d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Thu, 6 Oct 2016 03:07:56 +0000 (20:07 -0700)]
tests: Get rid of overly specific --pidfile and --unixctl options.
At an early point in OVS development, OVS was built with fixed default
directories for pidfiles and sockets. This meant that it was necessary to
use lots of --pidfile and --unixctl options in the testsuite, to point the
daemons to where they should put these files (since the testsuite cannot
and generally should not touch the real system /var/run). Later on,
the environment variables OVS_RUNDIR, OVS_LOGDIR, etc. were introduced
to override these defaults, and even later the testsuite was changed to
always set these variables correctly in every test. Thus, these days it
isn't usually necessary to specify a filename on --pidfile or to specify
--unixctl at all. However, many of the tests are built by cut-and-paste,
so they tended to keep appearing anyhow. This commit drops most of them,
making the testsuite easier to read and understand.
This commit also sweeps away some other historical detritus. In
particular, in early days of the testsuite there was no way to
automatically kill daemons when a test failed (or otherwise ended). This
meant that some tests were littered with calls to "kill `cat pidfile`" on
almost every line (or m4 macros that expanded to the same thing) so that, if
a test failed partway through, the testsuite would not hang waiting for a
daemon to die that was never going to die without manual intervention.
However, a long time ago we introduced the "on_exit" mechanism that
obsoletes this. This commit eliminates a lot of the old litter of kill
invocations, which also makes those tests easier to read.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
Ben Pfaff [Wed, 12 Oct 2016 17:40:53 +0000 (10:40 -0700)]
tests: Fix double-rebuild of testsuite for "check-valgrind" and similar.
When I ran "make check-valgrind -j10" and the testsuite needed to be
rebuilt, two copies of it were rebuilt in parallel and sometimes they
raced with each other. I don't have the full story on exactly why this
happened, but this commit, which eliminates redundant dependencies from
check-* targets, fixes the problem for me. The dependencies are redundant
because these targets depend on "all", which also depends on them.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
Ben Pfaff [Thu, 6 Oct 2016 22:31:07 +0000 (15:31 -0700)]
expr: Better simplify some special cases of expressions.
It's pretty unlikely that a human would write expressions like these, but
they can come up in machine-generated expressions and it seems best to
simplify them in an efficient way.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
Ben Pfaff [Fri, 7 Oct 2016 01:08:30 +0000 (18:08 -0700)]
expr: Fix abort when simplifying "x != 0/0".
The test added by this commit is very specific to the particular problem,
whereas a more general test would be better. A later commit adds the
general test.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
Ben Pfaff [Fri, 7 Oct 2016 00:54:19 +0000 (17:54 -0700)]
expr: Simplify "x == 0/0" into 1.
An expression like "x == 0/0" does not test any actual bits in field x,
so it resolves to true, but expr_simplify() was not smart enough to see
this.
This goes beyond an optimization, to become a bug fix, because
expr_normalize() will assert-fail for expressions that become trivial
when this simplification is omitted. For example:
The test added by this commit is very specific to the particular problem,
whereas a more general test would be better. A later commit adds the
general test.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
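The reasoning in the commit above, that "x == 0/0" tests no bits of x, can be illustrated with a toy model of a value/mask comparison. The helpers masked_eq() and masked_ne() are hypothetical and are not part of the expr library; they only model the semantics of comparing a field against value/mask.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of "x == value/mask": only bits set in 'mask'
 * take part in the comparison. */
static inline bool masked_eq(uint32_t x, uint32_t value, uint32_t mask)
{
    return (x & mask) == (value & mask);
}

/* Hypothetical model of "x != value/mask". */
static inline bool masked_ne(uint32_t x, uint32_t value, uint32_t mask)
{
    return !masked_eq(x, value, mask);
}
```

With a mask of 0, both sides of the comparison reduce to 0, so in this model masked_eq() is true for every x and masked_ne() is false for every x. That is why such expressions can be folded to constants instead of being carried through normalization.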