upcall: Don't start new revalidation round too soon after the last one.
The execution time of 'ovs-ofctl add-flows' with a large number of
flows can be more than halved if revalidators are not running after
each flow mod separately. This was first suspected when it was found
that 'ovs-ofctl --bundle add-flows' is about 10 times faster than the
same command without the '--bundle' option in a scenario where there
is a large set of flows being added and no datapath flows at all. One
of the differences caused by the '--bundle' option is that the
revalidators are woken up only once, at the end of the whole set of
flow table changes, rather than after each flow table change
individually.
This patch limits the revalidation to run at most 200 times a second
by enforcing a minimum of 5ms time gap between the start times of
revalidation rounds. If nothing happens for, say, 6 milliseconds, and
then a new flow table change is signaled, the revalidator threads wake
up immediately without any further delay. Values smaller than 5 ms were
found to increase the 'ovs-ofctl add-flows' execution time noticeably.
Since the revalidators are not running after each flow mod, the
overall OVS CPU utilization during the 'ovs-ofctl add-flows' run time
is reduced roughly by one core on a four core machine.
In testing, the 'ovs-ofctl add-flows' execution time is not
significantly improved beyond this even if the revalidators are not
notified about the flow table changes at all.
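The minimum-gap rule described above can be sketched as follows. This is only an illustration in Python, not the code from this patch: the class name RevalidationGate and its methods are hypothetical, and times are plain integers in milliseconds.

```python
MIN_GAP_MS = 5  # minimum gap between round start times, from the text above


class RevalidationGate:
    """Hypothetical helper: how long must the next round wait?"""

    def __init__(self, min_gap_ms=MIN_GAP_MS):
        self.min_gap_ms = min_gap_ms
        self.last_start_ms = None  # start time of the previous round

    def delay_before_start(self, now_ms):
        """Return the milliseconds to wait before a round may start."""
        if self.last_start_ms is None:
            return 0  # No earlier round: start right away.
        elapsed = now_ms - self.last_start_ms
        return max(0, self.min_gap_ms - elapsed)

    def start_round(self, now_ms):
        """Record that a round started at 'now_ms'."""
        self.last_start_ms = now_ms
```

With a round started at time 0, a change signaled at 2 ms waits 3 more milliseconds, while a change signaled at 6 ms starts a round without delay.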
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
David Hill [Tue, 30 Aug 2016 19:13:31 +0000 (15:13 -0400)]
netdev-linux: Use ethtool when miimon fails.
Some network drivers might return true to SIOCGMIIPHY and an error on
SIOCGMIIREG when using MII to query phy state. Fall back to ethtool if this
happens to allow failover to work when using such nics.
Reported-at: http://openvswitch.org/pipermail/dev/2016-August/078800.html Signed-off-by: David Hill <dhill@redhat.com> Signed-off-by: Joe Stringer <joe@ovn.org>
OVS GRE IPsec tunnel support has multiple issues; therefore
it was deprecated in OVS 2.6.
The following patch removes support for GRE IPsec and allows external
IPsec tunnel management for any type of tunnel, not just GRE.
For example, a user can encrypt Geneve or VXLAN traffic.
This can be done by using the OpenFlow pipeline to set skb-mark
and using IPsec keying daemons to implement the IPsec tunnels.
Packets can then be matched on the skb-mark to encrypt
selected tunnel traffic.
Andy Zhou [Tue, 20 Sep 2016 19:44:32 +0000 (12:44 -0700)]
ovsdb: Fix segfault during replication.
The newly added replication logic makes it possible for a monitor to
receive delete and insertion of the same row back to back, which
was not possible before. Add logic (and a comment) to handle this
case to avoid the following crash reported by Valgrind:
#0 0x0000000000453edd in ovsdb_datum_compare_3way
(a=0x5efbe60, b=0x0, type=0x5e6a848) at lib/ovsdb-data.c:1626
#1 0x0000000000453ea4 in ovsdb_datum_equals
(a=0x5efbe60, b=0x0, type=0x5e6a848) at lib/ovsdb-data.c:1616
#2 0x000000000041b651 in update_monitor_row_data
(mt=0x5eda4a0, row=0x5efbe00, data=0x0) at ovsdb/monitor.c:310
#3 0x000000000041ed14 in ovsdb_monitor_changes_update
(old=0x0, new=0x5efbe00, mt=0x5eda4a0, changes=0x5ef7180)
at ovsdb/monitor.c:1255
#4 0x000000000041f12e in ovsdb_monitor_change_cb
(old=0x0, new=0x5efbe00, changed=0x5efc218, aux_=0xffefff040)
at ovsdb/monitor.c:1339
#5 0x000000000042ded9 in ovsdb_txn_for_each_change
(txn=0x5efbd90, cb=0x41ef50 <ovsdb_monitor_change_cb>,
aux=0xffefff040) at ovsdb/transaction.c:906
#6 0x0000000000420155 in ovsdb_monitor_commit
(replica=0x5eda2c0, txn=0x5efbd90, durable=false)
at ovsdb/monitor.c:1553
#7 0x000000000042dc04 in ovsdb_txn_commit_
(txn=0x5efbd90, durable=false) at ovsdb/transaction.c:868
#8 0x000000000042ddd4 in ovsdb_txn_commit (txn=0x5efbd90, durable=false)
at ovsdb/transaction.c:893
#9 0x0000000000422e0c in process_notification
(table_updates=0x5efad10, db=0x5e6bd40) at ovsdb/replication.c:575
#10 0x0000000000420ff3 in replication_run () at ovsdb/replication.c:184
#11 0x0000000000405cc8 in main_loop
(jsonrpc=0x5e67770, all_dbs=0xffefff3a0, unixctl=0x5ebd980,
remotes=0xffefff360, run_process=0x0, exiting=0xffefff3c0,
is_backup=0xffefff2de) at ovsdb/ovsdb-server.c:198
#12 0x0000000000406edb in main (argc=1, argv=0xffefff550)
at ovsdb/ovsdb-server.c:429
Reported-by: Joe Stringer <joe@ovn.org>
Reported-at: http://openvswitch.org/pipermail/dev/2016-September/079315.html Reported-by: Alin Serdean <aserdean@cloudbasesolutions.com>
Reported-at: http://openvswitch.org/pipermail/dev/2016-September/079586.html Co-authored-by: Joe Stringer <joe@ovn.org> Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Fixes test failure seen due to the IPsec tunnel deprecation
messages in test logs.
Fixes: 9e9d0384910e ("openvswitch: deprecates support for IPsec tunnel port."). Reported-by: Joe Stringer <joe@ovn.org> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
openvswitch: deprecates support for IPsec tunnel port.
OVS IPsec tunnel support has issues:
1. It only works for GRE.
2. It only works on Debian.
3. It does not allow the user to match on the packet mark
of packets received on tunnel ports.
This patch deprecates support for IPsec tunnel port.
Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Ansis Atteka <aatteka@ovn.org>
ovn-controller: Store conntrack zone mappings to OVS database.
If ovn-controller is restarted, it may choose different conntrack zones
than had been previously used, which could cause the wrong conntrack
entries to be associated with a logical port. This commit stores in the
integration bridge's OVS "Bridge" table the mapping to the conntrack zone.
Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Joe Stringer [Fri, 23 Sep 2016 00:25:46 +0000 (17:25 -0700)]
ovs-lib: Fix SELinux contexts for created dirs.
ovs-lib creates several directories directly from the script, but
doesn't make any attempt to ensure that the correct SELinux context is
applied to these directories. As a result, the created directories end
up with type var_run_t rather than openvswitch_var_run_t.
During reboot using a tmpfs for /var/run, startup scripts will invoke
ovs-lib to create these directories with the wrong context. If SELinux
is enabled, OVS will fail to start as it cannot write to this directory.
Fix the issue by sprinkling "restorecon" in each of the places where
directories are created. In practice, many of these should otherwise be
handled by packaging scripts but if they exist then we should ensure the
correct SELinux context is set.
On systems where 'restorecon' is unavailable, this should be a no-op.
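The approach can be sketched as follows. ovs-lib itself is a shell script, so this Python version is only an illustration: make_dir_with_context is a hypothetical name, and the sketch assumes that running restorecon on the new directory is enough to restore its context.

```python
import os
import shutil
import subprocess


def make_dir_with_context(path):
    """Create 'path' and, where restorecon exists, reset its SELinux context.

    Returns True if restorecon was run, False if it is unavailable.
    """
    os.makedirs(path, exist_ok=True)
    restorecon = shutil.which("restorecon")
    if restorecon is None:
        return False  # No restorecon on this system: nothing more to do.
    # The exit status is ignored so that a labeling problem never prevents
    # the directory itself from being created.
    subprocess.run([restorecon, path], check=False,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return True
```

The shutil.which() check is what makes the call a no-op on systems without 'restorecon'.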
IPv4 and IPv6 packets have separate flows and should not overlap with a
catch-all flow that treats all packets like IPv4. It's unpredictable what
flow actually gets chosen in this situation.
Found by inspection.
Fixes: c34a87b6c570 ("ovn: Add support for IPv6 dynamic bindings.") Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Justin Pettit <jpettit@ovn.org>
ofproto-dpif-xlate: Adjust generated mask for fragments.
It's possible to install an OpenFlow flow that matches on udp source and
destination ports without matching on fragments. If the subtable where
such a flow resides is visited during translation of a later fragment, the
generated mask will have incorrect prerequisites for the datapath and it
would be revalidated away at the first chance.
This commit fixes it by adjusting the mask for later fragments after
translation.
Other prerequisites of the mask are also prerequisites in OpenFlow, but
not the ip fragment bit, that's why we need a special case here.
For completeness, this commit also fixes a related problem in bfd,
where we check the udp destination port without checking if the frame is
an ip fragment. It's not really necessary to address this separately,
given the adjustment that we perform.
ovn-northd uses ct_label[0] to keep track of the ACL changes on
existing connections. This patch replaces the usage of ct_label[0]
in the logical flows with a symbolic name, ct_label.blocked.
ofproto: Do not signal revalidation for group mods twice.
The new group mod implementation signals revalidation through
'->set_tables_version()', so the separate '->group_modify()' is no
longer needed. The ofproto-provider API is changed to allow
'group_modify' to be NULL.
Fixes: 5d08a275cd ("ofproto: Make groups versioned.") Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
ovn-controller: Fix possible null pointer dereference.
The code dereferences "chassis", which could be null if chassis_run()
returns null. "chassis" will always be null if "chassis_id" is null, so
checking "chassis" is sufficient to check both.
datapath: avoid deferred execution of recirc actions
Port upstream fix to datapath module. The only notable difference
between this patch and the upstream version is that the value of
ovs_recursion_limit (5 for upstream kernel, 4 for out-of-tree
module) is maintained in this patch.
openvswitch: avoid deferred execution of recirc actions
The ovs kernel data path currently defers the execution of all
recirc actions until stack utilization is at a minimum.
This is too limiting for some packet forwarding scenarios due to
the small size of the deferred action FIFO (10 entries). For
example, broadcast traffic sent out more than 10 ports with
recirculation results in packet drops when the deferred action
FIFO becomes full, as reported here:
Since the current recursion depth is available (it is already tracked
by the exec_actions_level pcpu variable), we can use it to determine
whether to execute recirculation actions immediately (safe when
recursion depth is low) or defer execution until more stack space is
available.
With this change, the deferred action fifo size becomes a non-issue
for currently failing scenarios because it is no longer used when
there are three or fewer recursions through ovs_execute_actions().
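The decision described above can be sketched as follows. This is a Python illustration, not the kernel code: handle_recirc and IMMEDIATE_LEVEL_MAX are hypothetical names, and the threshold of three is taken from the sentence above rather than from the datapath source.

```python
from collections import deque

DEFERRED_FIFO_SIZE = 10   # FIFO capacity stated in the message above
IMMEDIATE_LEVEL_MAX = 3   # assumed: "three or fewer recursions" run at once


def handle_recirc(level, fifo, action):
    """Return 'executed', 'deferred', or 'dropped' for one recirc action."""
    if level <= IMMEDIATE_LEVEL_MAX:
        return "executed"           # Shallow recursion: run it right away.
    if len(fifo) >= DEFERRED_FIFO_SIZE:
        return "dropped"            # FIFO full: the packet is lost.
    fifo.append(action)             # Deep recursion: defer until later.
    return "deferred"
```

At a low recursion level a broadcast to twelve ports never touches the FIFO, whereas deferring all twelve actions would overflow a ten-entry FIFO.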
Ryan Moats [Tue, 20 Sep 2016 15:35:46 +0000 (10:35 -0500)]
ofproto-dpif-xlate: Fix memory leak in execute_controller_action.
Commit df70a7731 ("ofproto-dpif-xlate: Allow translating
without side-effects.") created a memory leak by removing the
dp_packet_delete statement in execute_controller_action that
freed the earlier cloned packet. This commit restores this
statement at the end of the function.
Fixes: df70a7731 ("ofproto-dpif-xlate: Allow translating without side-effects.") Signed-off-by: Ryan Moats <rmoats@us.ibm.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
We start the communication between VM1 and VM2, for example, ICMP.
In the meantime, if we disconnect OVS from the SDN controller and then
reconnect them, the ovs-vswitchd process crashes.
backtrace:
0 0x00007f658082ffe4 in cls_rule_make_invisible_in_version ()
1 0x00007f65807f6bb3 in delete_flows_start__ ()
2 0x00007f65807f7ee9 in ofproto_group_mod_start ()
3 0x00007f65807fa07b in handle_openflow ()
4 0x00007f658082119b in connmgr_run ()
5 0x00007f65807f3ba6 in ofproto_run ()
6 0x00007f65807e101c in bridge_run__ ()
7 0x00007f65807e715d in bridge_run ()
8 0x00007f658065784d in main ()
Signed-off-by: Binbin Xu <xu.binbin1@zte.com.cn> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ciara Loftus [Fri, 19 Aug 2016 09:22:30 +0000 (10:22 +0100)]
netdev-dpdk: Add new 'dpdkvhostuserclient' port type
The 'dpdkvhostuser' port type no longer supports both server and client
mode. Instead, 'dpdkvhostuser' ports are always 'server' mode and
'dpdkvhostuserclient' ports are always 'client' mode.
Suggested-by: Daniele Di Proietto <diproiettod@vmware.com> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
tun-metadata: Manage tunnel TLV mapping table on a per-bridge basis.
When using tunnel TLVs (at the moment, this means Geneve options), a
controller must first map the class and type onto an appropriate OXM
field so that it can be used in OVS flow operations. This table is
managed using OpenFlow extensions.
The original code that added support for TLVs made the mapping table
global as a simplification. However, this is not really logically
correct as the OpenFlow management commands are operating on a per-bridge
basis. This removes the original limitation to make the table per-bridge.
One nice result of this change is that it is generally clearer whether
the tunnel metadata is in datapath or OpenFlow format. Rather than
allowing ad-hoc format changes and trying to handle both formats in the
tunnel metadata functions, the format is more clearly separated by function.
Datapaths (both kernel and userspace) use datapath format and it is not
changed during the upcall process. At the beginning of action translation,
tunnel metadata is converted to OpenFlow format and flows and wildcards
are translated back at the end of the process.
As an additional benefit, this change improves performance in some flow
setup situations by keeping the tunnel metadata in the original packet
format in more cases. This helps when copies need to be made as the amount
of data touched is only what is present in the packet rather than the
maximum amount of metadata supported.
datapath: backport: openvswitch: use alias for genetlink family names
Upstream commit:
commit ed227099dac95128e2aecd62af51bb9d922e5977
Author: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Date: Fri Sep 9 17:42:30 2016 -0300
openvswitch: use alias for genetlink family names
When userspace tries to create datapaths and the module is not loaded,
it will simply fail. With this patch, the module will be automatically
loaded.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com> Signed-off-by: Jesse Gross <jesse@kernel.org>
Ben Pfaff [Thu, 15 Sep 2016 03:59:10 +0000 (20:59 -0700)]
ofp-parse: Fix sparse warnings about comparing ofp_port_ts.
Without this, sparse complains:
lib/ofp-parse.c:588:19: warning: restricted ofp_port_t degrades to integer
lib/ofp-parse.c:588:31: warning: restricted ofp_port_t degrades to integer
This is one of the irritating bits of using sparse, but on the whole I
think it saves us pretty often.
CC: Jarno Rajahalme <jarno@ovn.org> Fixes: 6dd3c787f591 ("ofproto: Support packet_outs in bundles.") Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
Ben Pfaff [Thu, 15 Sep 2016 18:43:46 +0000 (11:43 -0700)]
ofproto-dpif-xlate: Fix treatment of mirrors across patch port.
When the bridges on both sides of a patch port included mirrors, the
translation code incorrectly conflated them instead of treating them as
independent.
Add a new select group selection method "dp_hash", which uses a minimal
number of bits from the datapath-calculated packet hash to inform the
select group bucket selection. This makes the datapath flows more
generic, resulting in fewer upcalls to userspace, but adds recirculation
prior to group selection.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
ofp-parse: Remove double uninit of group mod if parsing fails.
Double ofputil_uninit_group_mod() used to be harmless, but leads to
double free after commit e8dba7197, which will crash if any error in
group parsing happens.
Add a test to prevent this regression from happening again.
datapath: compat: tunnels: Log error during initialization.
At present, OVS compat tunneling can fail due to a conflict with
an already loaded tunneling kernel module. In this case openvswitch
kernel module loading fails silently. The following patch gives more
clues about what went wrong.
Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
Patch b0d38b2f17 unified flow mod reporting in ofproto for both
stand-alone flow mods and bundle flow mods, but left bundle-specific
reporting to the bundle removal code. This patch fixes this by
removing the bundle-specific reporting of flow mods.
Found by inspection.
Fixes: b0d38b2f17 ("ofproto: Report flow mods also from bundles.") Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Thu, 15 Sep 2016 03:39:03 +0000 (20:39 -0700)]
socket-util-unix: Avoid buffer read overrun in get_unix_name_len().
If the socket length does not include any of the bytes of the path, then
the code should not read even the first byte of the path.
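The rule above can be sketched as follows. This Python illustration is not the C code from the patch: unix_name_len is a hypothetical stand-in, the offset of 2 for the path inside the address is an assumption, and treating a leading NUL as "no name" is likewise assumed.

```python
SUN_PATH_OFFSET = 2  # assumed offset of the path inside the socket address


def unix_name_len(raw, sun_len):
    """Length of the path in 'raw', of which only 'sun_len' bytes are valid."""
    if sun_len <= SUN_PATH_OFFSET:
        return 0   # No path byte is valid, so do not look at the path at all.
    if raw[SUN_PATH_OFFSET] == 0:
        return 0   # Assumed: a leading NUL means there is no name.
    return sun_len - SUN_PATH_OFFSET
```

The length check comes first, so a buffer holding only the address family is never indexed past its end.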
Found by valgrind.
Reported-by: Joe Stringer <joe@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com> Acked-by: Joe Stringer <joe@ovn.org>
After adding log messages to better understand IPAM-related code
in ovn northd, the IPAM tests began to fail occasionally. Adding
--wait=sb to commands triggering address allocation eliminated
these failures (there were no failures with 100 executions when
testing with this change).
Signed-off-by: Lance Richardson <lrichard@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Add support for OFPT_PACKET_OUT messages in bundles.
While ovs-ofctl already has a packet-out command, we did not have a
string parser for it, as the parsing was done directly from command
line arguments.
This patch adds the string parser for packet-out messages, adds
support for it into the 'ovs-ofctl packet-out' command, and adds a new
ofctl/packet-out ovs-appctl command that can be used when ovs-ofctl is
used as a flow monitor. The old 'ovs-ofctl packet-out' syntax is
deprecated and will be removed in a later OVS release.
The new packet-out parser is further supported with the ovs-ofctl
bundle command, which allows bundles to mix flow mods, group mods and
packet-out messages. Also the packet-outs in bundles are only
executed if the whole bundle is successful. A failing packet-out
translation may also cause the whole bundle to fail.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Refactor handle_packet_out() to prepare for bundle support for packet
outs in a later patch.
Two new callbacks are introduced in ofproto-provider class:
->packet_xlate() and ->packet_execute(). ->packet_xlate() translates
the packet using the flow and actions provided by the caller, but
defers all OpenFlow-visible side-effects (stats, learn actions, actual
packet output, etc.) to be explicitly executed with the
->packet_execute() call.
Adds a new ofproto_rule_reduce_timeouts__() that must be called with
'ofproto_mutex' held. This is used in the next patch.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
ofproto-dpif-xlate: Allow translating without side-effects.
Extend 'may_learn' attribute to also control the treatment of
FIN_TIMEOUT action and asynchronous messages (packet ins,
continuations), so that when 'may_learn' is 'false' and
'resubmit_stats' is 'NULL', no OpenFlow-visible side effects are
generated by the translation.
Correspondingly, add support for one-time asynchronous messages to
xlate cache, so that all side-effects of the translation may be
executed at a later stage. This will be useful for bundle commits.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
ofproto: Use ofproto_flow_mod for learn execution from xlate cache.
Use ofproto_flow_mod with a reference to an existing or new rule
instead of ofputil_flow_mod for learn action execution from xlate
cache.
Typically we would find that when a learn xlate cache entry is
created, a preceding upcall has already created the learned flow. In
this case the xlate cache entry takes a reference to that flow and
keeps refreshing it without needing to perform any flow table lookups.
Otherwise the creation of the xlate cache entry creates the new rule,
which is then subsequently added to the classifier. In both cases
this is both faster and shrinks the memory cost of each learn cache
entry from ~3.5kb to about 0.3kb.
If the learned rule does not yet exist, it is created and attached to
the ofproto_flow_mod, from which it is then added. If the referred
rule happens to expire, or is modified in any way and is thus removed
from the classifier tables, we create a new rule using the old rule as
a template, so that we can avoid storing the ofputil_flow_mod in all
cases.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
ofproto-dpif-xlate: Add xlate cache type XC_TABLE.
Xlate cache entry type XC_TABLE is required for the table stats
(number of misses and matches) to be correctly attributed.
It appears that table stats have been off ever since xlate cache was
introduced. This was now revealed by a PACKET_OUT unit test case in a
later patch that checks for table stats explicitly.
Later patches will need to create xlate cache entries from different
modules. This patch refactors the xlate cache code in preparation
without any functional changes, so that the changes are clearly
visible in the following patches.
The definition of XC_ENTRY_FOR_EACH() iterator macro is changed so
that it now does not take the xlate cache pointer to unify the usage
across all call sites.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Make mac table update functions part of the mac-learning module, which
also helps in figuring what is the minimal set of struct flow fields
needed for the update. Use this to change the xlate cache entry for
XC_NORMAL to not take a copy of the struct flow, but only save the
in_port, dl_src, and some auxiliary fields. This reduces the memory
burden of XC_NORMAL by roughly 0.5kb.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Zongkai LI [Fri, 9 Sep 2016 06:39:17 +0000 (06:39 +0000)]
ovn-northd: add dhcpv6 stateless option support
This patch adds DHCPv6 stateless option support, to allow OVN native DHCPv6
to work in stateless mode.
A user can add the new option dhcpv6_stateless with the string value true in
the DHCP_Options.options column, so that OVN DHCPv6 replies only with other
configuration to DHCPv6 request messages coming from VM/VIF ports, and the
VM/VIF ports get their IPv6 addresses configured in the stateless way.
Signed-off-by: Zongkai LI <zealokii@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Add a few messages at INFO to help debug the vif lifecycle.
A log search on MAC or IP address helps debug what happened to the
vif and when. This helps easily correlate logs across CMS and ovn.
Logs appear like this:
2016-09-01T18:15:48Z|00014|binding|INFO|Claiming lport eee1a9af-7513-4540-9385-9e3972bfca05 for this chassis.
2016-09-01T18:15:48Z|00015|binding|INFO|Claiming fa:16:3e:01:c3:4a 10.0.0.7 fd93:b509:aa46:0:f816:3eff:fe01:c34a
2016-09-01T18:15:59Z|00016|pinctrl|INFO|DHCPOFFER fa:16:3e:01:c3:4a 10.0.0.7
2016-09-01T18:15:59Z|00017|pinctrl|INFO|DHCPACK fa:16:3e:01:c3:4a 10.0.0.7
2016-09-01T18:16:22Z|00018|binding|INFO|Releasing lport eee1a9af-7513-4540-9385-9e3972bfca05 from this chassis.
2016-09-01T18:16:22Z|00019|binding|INFO|Releasing fa:16:3e:01:c3:4a 10.0.0.7 fd93:b509:aa46:0:f816:3eff:fe01:c34a
Signed-off-by: Ramu Ramamurthy <ramu.ramamurthy@us.ibm.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ramu Ramamurthy [Tue, 30 Aug 2016 23:58:34 +0000 (23:58 +0000)]
ovn: add lsp-deletion and bcast-flow removal tests for localnet
Add 2 tests for scenarios around lsp-deletion and flow removal
which have escaped current unit tests.
These tests depend on the following patch:
"ovn-controller: Back out incremental processing" and pass
after applying it, but currently fail on master.
1) In the following sequence of events,
create&bind vif1, create&bind vif2, delete vif1
we find that the localnet patch port
got deleted, whereas it should exist because there is a
bound vif2.
2) The flow broadcasting to tunnels in table=32 must be deleted
when a localnet port gets bound, but we find that the flow remains
in table 32 causing broadcasts to both tunnels and localnet patch.
Signed-off-by: Ramu Ramamurthy <ramu.ramamurthy@us.ibm.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
lib: Create $(sysconfdir)/openvswitch upon install
In cases where dbdir and etcdir are not the same, there is a need
for creating etcdir (i.e. $(sysconfdir)/openvswitch) explicitly.
Note that there is no attempt being made here to make the etcdir
configurable as in "--with-dbdir".
Reported-at: http://openvswitch.org/pipermail/dev/2016-September/TBD.html Fixes: f973f2af2fd4 ("Make the location of the database separately configurable.") Signed-off-by: Flavio Fernandes <flavio@flaviof.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Windows: Extend support for binaries which allow detach
On Windows we require service_start to be called to parse and set up
the requirements for the '--detach' argument.
Affected binaries: ovn-trace, ovsdb-client, ovs-testcontroller.
Subsequent patches will be sent to adapt the tests with the new features.
Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Gurucharan Shetty <guru@ovn.org>
connmgr: Make connmgr_wants_packet_in_on_miss() lock-free.
Make connmgr_wants_packet_in_on_miss() use an atomic int instead of a
list traversal taking the 'ofproto_mutex'. This allows
connmgr_wants_packet_in_on_miss() to be called also when
'ofproto_mutex' is already held, and makes it faster, too.
ofproto: Change rule's 'removed' member to a tri-state 'state'.
As a rule may not be re-inserted to ofproto data structures, it is
cleaner to have three states for the rule, rather than just two. This
will be useful for managing learned flows in later patches.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
ofproto: Add a fixed bundle idle timeout of 10 seconds.
Timing out idle bundles frees memory that would effectively be leaked
if a long-standing OpenFlow connection failed to commit or discard
a bundle.
The OpenFlow specification mandates the timeout to be at least one second,
if the switch implements such a timeout. This patch sets the bundle
idle timeout to 10 seconds.
We do not limit the number of messages in a bundle, so it does not
make sense to limit the number of bundles either, especially now that
idle bundles are timed out.
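The idle timeout can be sketched as follows. This is a Python illustration, not the ofproto code: expire_idle_bundles is a hypothetical name, 'bundles' is assumed to map a bundle id to the time it was last used, and whether the real comparison is strict is not stated above.

```python
BUNDLE_IDLE_TIMEOUT_S = 10  # fixed timeout from the message above


def expire_idle_bundles(bundles, now):
    """Drop bundles idle for at least the timeout; return the expired ids."""
    expired = [bundle_id for bundle_id, used in bundles.items()
               if now - used >= BUNDLE_IDLE_TIMEOUT_S]
    for bundle_id in expired:
        del bundles[bundle_id]   # Frees whatever the bundle was holding.
    return expired
```

A bundle last touched at time 0 is dropped at time 12, while one touched at time 5 survives.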
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Add "bundle" keyword to ofp-print.at tests about bundle messages.
Add a missing ofp-print.at test for bundle group mods.
Remove "monitor" keyword from ofproto.at tests that do not use a monitor.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Fix the legal notice section in OVSEXT.SYS properties. Update the MSI to
include the properties mentioned in MSDN - 'Extension driver MSI packaging
requirements' section -
https://msdn.microsoft.com/windows/hardware/drivers/network/extension-driver-msi-packaging-requirements
Ben Pfaff [Sun, 11 Sep 2016 04:23:22 +0000 (21:23 -0700)]
replication: Be more careful about JSON parsing and simplify code.
The code here wasn't careful about parsing JSON received from the remote
OVSDB server. It assumed, for example, that a row that the remote server
implied was new was actually new, without looking to see whether there was
already a row with that UUID. This commit improves this validation. It
also rewrites code that translated updates locally into calls into the
query engine, via JSON, into simple lookups by UUID.
For me, this fixes a test failure in test 1866
(ovsdb-server/active-backup-role-switching), which caused the following
valgrind report:
==18725== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==18725== Access not within mapped region at address 0x0
==18725== at 0x43937E: ovsdb_datum_compare_3way (ovsdb-data.c:1626)
==18725== by 0x439344: ovsdb_datum_equals (ovsdb-data.c:1616)
==18725== by 0x4166CC: update_monitor_row_data (monitor.c:310)
==18725== by 0x414A90: ovsdb_monitor_changes_update (monitor.c:1255)
==18725== by 0x417009: ovsdb_monitor_change_cb (monitor.c:1339)
==18725== by 0x41DB52: ovsdb_txn_for_each_change (transaction.c:906)
==18725== by 0x416CC9: ovsdb_monitor_commit (monitor.c:1553)
==18725== by 0x41D993: ovsdb_txn_commit_ (transaction.c:868)
==18725== by 0x41D6F5: ovsdb_txn_commit (transaction.c:893)
==18725== by 0x418185: process_notification (replication.c:576)
==18725== by 0x417705: replication_run (replication.c:185)
==18725== by 0x408240: main_loop (ovsdb-server.c:198)
==18725== by 0x406432: main (ovsdb-server.c:429)
I don't know the exact cause of the problem, but this new implementation
leaves me more confident due to its simplicity.
Reported-by: Joe Stringer <joe@ovn.org>
Reported-at: http://openvswitch.org/pipermail/dev/2016-September/079315.html Fixes: 60e0cd041958 ("ovsdb: Replication usability improvements") Signed-off-by: Ben Pfaff <blp@ovn.org> Tested-by: Joe Stringer <joe@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
Joe Stringer [Fri, 9 Sep 2016 20:48:53 +0000 (13:48 -0700)]
ovsdb: Fix replication memory leak.
Valgrind reports:
==18725== 32 bytes in 1 blocks are definitely lost in loss record 339 of 497
==18725== at 0x4C29BBE: malloc (in
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==18725== by 0x450F1F: xmalloc (util.c:112)
==18725== by 0x41748E: replication_add_local_db (replication.c:137)
==18725== by 0x40803B: ovsdb_replication_init (ovsdb-server.c:146)
==18725== by 0x407C9E: ovsdb_server_connect_active_ovsdb_server
(ovsdb-server.c:1165)
==18725== by 0x450AB3: process_command (unixctl.c:313)
==18725== by 0x4500DC: run_connection (unixctl.c:347)
==18725== by 0x44FFB6: unixctl_server_run (unixctl.c:400)
==18725== by 0x4081AC: main_loop (ovsdb-server.c:182)
==18725== by 0x406432: main (ovsdb-server.c:429)
Fixes: 60e0cd041958 ("ovsdb: Replication usability improvements") Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Flavio Fernandes <flavio@flaviof.com> Acked-by: Ben Pfaff <blp@ovn.org>
A bitmap in 'struct group_table' is used to track all the allocated
group_ids. For every run of logical flows action parsing, we
add 'group_info' structure to a hmap called 'desired_groups'. The
group_id assigned to this group_info either comes from an already
installed 'existing groups' or a new reservation done in the bitmap.
In ofctrl_put(), if there is a backlog, we call ovn_group_table_clear().
This could unreserve a group_id that comes from an already existing group.
This could result in re-use of group_id in the future causing errors while
installing new groups.
This commit fixes the above scenario.
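One plausible way to avoid the problem is sketched below. This Python illustration is not the code from this commit: the GroupTable class, its fields, and the use of a set in place of the bitmap are all hypothetical, and the actual fix may differ.

```python
class GroupTable:
    """Hypothetical stand-in for the group_id bookkeeping."""

    def __init__(self):
        self.reserved = set()   # plays the role of the bitmap
        self.existing = {}      # group_id -> name of an installed group
        self.desired = {}       # group_id -> name, rebuilt on each run

    def add_desired(self, name):
        """Reuse the id of an installed group, else reserve a new one."""
        for group_id, existing_name in self.existing.items():
            if existing_name == name:
                self.desired[group_id] = name
                return group_id
        group_id = 1
        while group_id in self.reserved:
            group_id += 1
        self.reserved.add(group_id)
        self.desired[group_id] = name
        return group_id

    def clear_desired(self):
        """Forget desired groups without unreserving installed ids."""
        for group_id in self.desired:
            if group_id not in self.existing:
                self.reserved.discard(group_id)
        self.desired.clear()
```

Because clear_desired() skips ids that belong to installed groups, a later reservation cannot hand out an id that is still in use.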
Signed-off-by: Gurucharan Shetty <guru@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
When there are hundreds of nodes controlled by OVN, the workflow
to track and allocate unique tags across multiple hosts becomes
complicated. It is much easier to let ovn-northd do the allocation.
Signed-off-by: Gurucharan Shetty <guru@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
ovn-northd: Add load-balancers to gateway routers.
Load-balancers in gateway routers let us load-balance
north-south traffic.
This commit adds a new table called "DEFRAG" in the
logical router pipeline to defragment packets and to track them.
Once the packet is tracked, new connections get a group id as
an action. The group in turn chooses a DNAT action. Established
connections go through the DNAT table for a regular DNAT.
Signed-off-by: Gurucharan Shetty <guru@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
ovn-controller: Datapath based conntrack zone for load-balancing.
Currently ct_lb() logical action is only added for a logical switch and
we use the conntrack zone allocated for the logical port. A future commit
will use ct_lb() for a logical router too. In that case, use the allocated
DNAT zone.
Rationale for not passing zone as an argument for ct_lb():
One way to look at it would be that a "zone" is an internal implementation
detail and should not be seen in an action of a logical flow. But we can then
say that we could rename "zone" as "datapath" in the logical action. But,
then we would be limiting it to 2 anyway (datapath=lswitch or
datapath=lrouter) - in which case we are inferring it with the current patch.
Signed-off-by: Gurucharan Shetty <guru@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Joe Stringer [Fri, 2 Sep 2016 00:01:55 +0000 (17:01 -0700)]
datapath: Use pre-routing hook for conntrack.
The upstream code uses NF_INET_PRE_ROUTING hook for the nf_conntrack_in()
call, which does deeper (e.g. l4proto) validation. It was previously
thought that using the NF_INET_PRE_ROUTING hook for this function on older
kernels would trigger kernel panics due to a dependency on the
unpopulated skb->dev, however during recent testing on a variety of
platforms (Centos7.[12], Ubuntu 1[46].04, Fedora23) using the latest
distribution kernels and the OVS kernel module testsuite, no such kernel
panics were observed. Therefore it appears to be safe to bring this in
line with upstream without any other workarounds.
Ryan Moats [Fri, 9 Sep 2016 12:36:47 +0000 (07:36 -0500)]
ovn-nbctl, tests: Clean up noisy memory leaks
When run with valgrind, ovn-nbctl.c and tests/test-ovn.c reveal
memory leaks of their own. This patch cleans these up so that
they don't create noise when looking for leaks in the OVN daemon
processes.
Signed-off-by: Ryan Moats <rmoats@us.ibm.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
rhel: add option to run kernel datapath test when building rpms
Add ability to execute kernel datapath tests when building rpms.
These tests are disabled by default, and can optionally be run
by providing "--with check_datapath_kernel" on the rpmbuild command
line. This is intended to facilitate automated testing, and
should not be used in production environments (it is generally not
recommended to run rpmbuild as root).
ovn-controller: Fix match criteria for dynamic mac binding flows
The match struct is not initialized before adding flows for each entry in
mac_bindings table. The matches for IPv4 and IPv6 entries don't have
exactly the same form (IPv4 uses reg0, IPv6 uses xxreg0), so reusing
a match structure can cause problems.
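The idea can be sketched as follows. This is a Python illustration, not the ovn-controller code: build_match is a hypothetical name, a dict stands in for the match struct, and telling IPv6 from IPv4 by the presence of ':' is a simplification.

```python
def build_match(entry):
    """Build a fresh match for one mac_bindings entry."""
    match = {}  # Start from an empty match for every entry.
    if ":" in entry["ip"]:
        match["xxreg0"] = entry["ip"]   # IPv6 entries match on xxreg0.
    else:
        match["reg0"] = entry["ip"]     # IPv4 entries match on reg0.
    return match
```

Because each call starts from an empty match, an IPv6 entry processed after an IPv4 entry carries no leftover reg0 field.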
Signed-off-by: Chandra Sekhar Vejendla <csvejend@us.ibm.com> Signed-off-by: Ryan Moats <rmoats@us.ibm.com> Co-authored-by: Ryan Moats <rmoats@us.ibm.com> Signed-off-by: Ben Pfaff <blp@ovn.org>