ofproto-dpif-xlate: Allow translating without side-effects.
Extend 'may_learn' attribute to also control the treatment of
FIN_TIMEOUT action and asynchronous messages (packet ins,
continuations), so that when 'may_learn' is 'false' and
'resubmit_stats' is 'NULL', no OpenFlow-visible side effects are
generated by the translation.
Correspondingly, add support for one-time asynchronous messages to
xlate cache, so that all side-effects of the translation may be
executed at a later stage. This will be useful for bundle commits.
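A minimal Python sketch of the deferral idea, not the OVS code: the `XlateCache` class, its `add` and `execute` methods, and the `translate` function are hypothetical names, used only to show side effects being recorded during translation and run at a later stage.

```python
class XlateCache:
    """Hypothetical stand-in for an xlate cache: a list of deferred side effects."""

    def __init__(self):
        self.entries = []

    def add(self, effect):
        # Record a callable instead of running it during translation.
        self.entries.append(effect)

    def execute(self):
        # Run every recorded side effect, in the order it was recorded.
        for effect in self.entries:
            effect()


def translate(packet, cache, sent):
    """Hypothetical translation: decides on a packet-in but does not send it."""
    cache.add(lambda: sent.append(("packet_in", packet)))
    return "output:1"


sent_messages = []
cache = XlateCache()
actions = translate("pkt0", cache, sent_messages)
effects_before_commit = list(sent_messages)  # still empty: nothing visible yet
cache.execute()                              # e.g. at bundle commit time
```

Until `execute()` is called, the translation has produced actions but no visible messages, which is the property a bundle commit would rely on.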
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
ofproto: Use ofproto_flow_mod for learn execution from xlate cache.
Use ofproto_flow_mod with a reference to an existing or new rule
instead of ofputil_flow_mod for learn action execution from xlate
cache.
Typically we would find that when a learn xlate cache entry is
created, a preceding upcall has already created the learned flow. In
this case the xlate cache entry takes a reference to that flow and
keeps refreshing it without needing to perform any flow table lookups.
Otherwise the creation of the xlate cache entry creates the new rule,
which is then subsequently added to the classifier. In both cases
this is both faster and shrinks the memory cost of each learn cache
entry from ~3.5kb to about 0.3kb.
If the learned rule does not yet exist, it is created and attached to
the ofproto_flow_mod, from which it is then added. If the referred
rule happens to expire, or is modified in any way and is thus removed
from the classifier tables, we create a new rule using the old rule as
a template, so that we can avoid storing the ofputil_flow_mod in all
cases.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
ofproto-dpif-xlate: Add xlate cache type XC_TABLE.
Xlate cache entry type XC_TABLE is required for the table stats
(number of misses and matches) to be correctly attributed.
It appears that table stats have been off ever since xlate cache was
introduced. This was now revealed by a PACKET_OUT unit test case in a
later patch that checks for table stats explicitly.
Later patches will need to create xlate cache entries from different
modules. This patch refactors the xlate cache code in preparation
without any functional changes, so that the changes are clearly
visible in the following patches.
The definition of XC_ENTRY_FOR_EACH() iterator macro is changed so
that it now does not take the xlate cache pointer to unify the usage
across all call sites.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Make mac table update functions part of the mac-learning module, which
also helps in figuring out the minimal set of struct flow fields
needed for the update. Use this to change the xlate cache entry for
XC_NORMAL to not take a copy of the struct flow, but only save the
in_port, dl_src, and some auxiliary fields. This reduces the memory
burden of XC_NORMAL by roughly 0.5kb.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Zongkai LI [Fri, 9 Sep 2016 06:39:17 +0000 (06:39 +0000)]
ovn-northd: add dhcpv6 stateless option support
This patch adds DHCPv6 stateless option support, to allow OVN native DHCPv6
to work in stateless mode.
Users can add the new option dhcpv6_stateless with the string value true to
the DHCP_Options.options column. OVN DHCPv6 then replies only with other
configuration to DHCPv6 request messages that come from VM/VIF ports, and the
VM/VIF ports get their IPv6 addresses configured statelessly.
Signed-off-by: Zongkai LI <zealokii@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Add a few messages at INFO to help debug the vif lifecycle.
A logsearch on mac or ip helps debug what happened to the
vif and when. This helps easily correlate logs across CMS and ovn.
Logs appear like this:
2016-09-01T18:15:48Z|00014|binding|INFO|Claiming lport eee1a9af-7513-4540-9385-9e3972bfca05 for this chassis.
2016-09-01T18:15:48Z|00015|binding|INFO|Claiming fa:16:3e:01:c3:4a 10.0.0.7 fd93:b509:aa46:0:f816:3eff:fe01:c34a
2016-09-01T18:15:59Z|00016|pinctrl|INFO|DHCPOFFER fa:16:3e:01:c3:4a 10.0.0.7
2016-09-01T18:15:59Z|00017|pinctrl|INFO|DHCPACK fa:16:3e:01:c3:4a 10.0.0.7
2016-09-01T18:16:22Z|00018|binding|INFO|Releasing lport eee1a9af-7513-4540-9385-9e3972bfca05 from this chassis.
2016-09-01T18:16:22Z|00019|binding|INFO|Releasing fa:16:3e:01:c3:4a 10.0.0.7 fd93:b509:aa46:0:f816:3eff:fe01:c34a
Signed-off-by: Ramu Ramamurthy <ramu.ramamurthy@us.ibm.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ramu Ramamurthy [Tue, 30 Aug 2016 23:58:34 +0000 (23:58 +0000)]
ovn: add lsp-deletion and bcast-flow removal tests for localnet
Add 2 tests for scenarios around lsp-deletion and flow removal
which have escaped current unit tests.
These tests depend on the following patch:
"ovn-controller: Back out incremental processing" and pass
after applying it, but currently fail on master.
1) In the following sequence of events,
create&bind vif1, create&bind vif2, delete vif1
we find that the localnet patch port
got deleted, whereas it should exist because there is a
bound vif2.
2) The flow broadcasting to tunnels in table=32 must be deleted
when a localnet port gets bound, but we find that the flow remains
in table 32 causing broadcasts to both tunnels and localnet patch.
Signed-off-by: Ramu Ramamurthy <ramu.ramamurthy@us.ibm.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
lib: Create $(sysconfdir)/openvswitch upon install
In cases where dbdir and etcdir are not the same, there is a need
for creating etcdir (i.e. $(sysconfdir)/openvswitch) explicitly.
Note that there is no attempt being made here to make the etcdir
configurable as in "--with-dbdir".
Reported-at: http://openvswitch.org/pipermail/dev/2016-September/TBD.html Fixes: f973f2af2fd4 ("Make the location of the database separately configurable.") Signed-off-by: Flavio Fernandes <flavio@flaviof.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Windows: Extend support for binaries which allow detach
On Windows we require service_start to be called to parse and setup
requirements for '--detach' argument.
Affected binaries: ovn-trace, ovsdb-client, ovs-testcontroller.
Subsequent patches will be sent to adapt the tests with the new features.
Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Gurucharan Shetty <guru@ovn.org>
connmgr: Make connmgr_wants_packet_in_on_miss() lock-free.
Make connmgr_wants_packet_in_on_miss() use an atomic int instead of a
list traversal taking the 'ofproto_mutex'. This allows
connmgr_wants_packet_in_on_miss() to be called also when
'ofproto_mutex' is already held, and makes it faster, too.
ofproto: Change rule's 'removed' member to a tri-state 'state'.
As a rule may not be re-inserted to ofproto data structures, it is
cleaner to have three states for the rule, rather than just two. This
will be useful for managing learned flows in later patches.
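A minimal Python sketch of a forward-only tri-state. The state names and the allowed transitions are assumptions for illustration; the commit message only says that three states replace the boolean and that a rule may not be re-inserted.

```python
from enum import Enum


class RuleState(Enum):
    # Hypothetical names; the commit only says the state has three values.
    INITIALIZED = 1
    INSERTED = 2
    REMOVED = 3


# A rule only moves forward, so a removed rule can never be inserted again.
ALLOWED = {
    (RuleState.INITIALIZED, RuleState.INSERTED),
    (RuleState.INSERTED, RuleState.REMOVED),
}


def advance(current, new):
    """Return the new state, or raise if the transition is not allowed."""
    if (current, new) not in ALLOWED:
        raise ValueError(f"illegal transition {current.name} -> {new.name}")
    return new


state = advance(RuleState.INITIALIZED, RuleState.INSERTED)
state = advance(state, RuleState.REMOVED)

try:
    advance(state, RuleState.INSERTED)
    reinserted = True
except ValueError:
    reinserted = False
```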
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
ofproto: Add a fixed bundle idle timeout of 10 seconds.
Timing out idle bundles frees memory that would effectively be leaked
if a long standing OpenFlow connection would fail to commit or discard
a bundle.
OpenFlow specification mandates the timeout to be at least one second,
if the switch implements such a timeout. This patch sets the bundle
idle timeout to 10 seconds.
We do not limit the number of messages in a bundle, so it does not
make sense to limit the number of bundles either, especially now that
idle bundles are timed out.
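A minimal Python sketch of idle-bundle expiry, assuming each bundle records the time it was last used. The `bundles` dictionary layout and the `expire_idle_bundles` helper are hypothetical; only the 10 second value comes from this patch.

```python
BUNDLE_IDLE_TIMEOUT = 10.0  # seconds, the fixed value chosen by this patch


def expire_idle_bundles(bundles, now):
    """Drop bundles whose last use is at least BUNDLE_IDLE_TIMEOUT old.

    'bundles' maps a bundle id to the time it was last used (hypothetical
    layout). Returns the ids that were expired.
    """
    expired = [bid for bid, used in bundles.items()
               if now - used >= BUNDLE_IDLE_TIMEOUT]
    for bid in expired:
        del bundles[bid]
    return expired


bundles = {1: 0.0, 2: 7.5}
expired = expire_idle_bundles(bundles, now=12.0)
```

Here bundle 1 has been idle for 12 seconds and is dropped, while bundle 2 has been idle for 4.5 seconds and is kept.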
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Add "bundle" keyword to ofp-print.at tests about bundle messages.
Add a missing ofp-print.at test for bundle group mods.
Remove "monitor" keyword from ofproto.at tests that do not use a monitor.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Fix the legal notice section in OVSEXT.SYS properties. Update the MSI to
include the properties mentioned in MSDN - 'Extension driver MSI packaging
requirements' section -
https://msdn.microsoft.com/windows/hardware/drivers/network/extension-driver-msi-packaging-requirements
Ben Pfaff [Sun, 11 Sep 2016 04:23:22 +0000 (21:23 -0700)]
replication: Be more careful about JSON parsing and simplify code.
The code here wasn't careful about parsing JSON received from the remote
OVSDB server. It assumed, for example, that a row that the remote server
implied was new was actually new, without looking to see whether there was
already a row with that UUID. This commit improves this validation. It
also rewrites code that translated updates locally into calls into the
query engine, via JSON, into simple lookups by UUID.
For me, this fixes a test failure in test 1866
(ovsdb-server/active-backup-role-switching), which caused the following
valgrind report:
==18725== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==18725== Access not within mapped region at address 0x0
==18725== at 0x43937E: ovsdb_datum_compare_3way (ovsdb-data.c:1626)
==18725== by 0x439344: ovsdb_datum_equals (ovsdb-data.c:1616)
==18725== by 0x4166CC: update_monitor_row_data (monitor.c:310)
==18725== by 0x414A90: ovsdb_monitor_changes_update (monitor.c:1255)
==18725== by 0x417009: ovsdb_monitor_change_cb (monitor.c:1339)
==18725== by 0x41DB52: ovsdb_txn_for_each_change (transaction.c:906)
==18725== by 0x416CC9: ovsdb_monitor_commit (monitor.c:1553)
==18725== by 0x41D993: ovsdb_txn_commit_ (transaction.c:868)
==18725== by 0x41D6F5: ovsdb_txn_commit (transaction.c:893)
==18725== by 0x418185: process_notification (replication.c:576)
==18725== by 0x417705: replication_run (replication.c:185)
==18725== by 0x408240: main_loop (ovsdb-server.c:198)
==18725== by 0x406432: main (ovsdb-server.c:429)
I don't know the exact cause of the problem, but this new implementation
leaves me more confident due to its simplicity.
Reported-by: Joe Stringer <joe@ovn.org>
Reported-at: http://openvswitch.org/pipermail/dev/2016-September/079315.html Fixes: 60e0cd041958 ("ovsdb: Replication usability improvements") Signed-off-by: Ben Pfaff <blp@ovn.org> Tested-by: Joe Stringer <joe@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
Joe Stringer [Fri, 9 Sep 2016 20:48:53 +0000 (13:48 -0700)]
ovsdb: Fix replication memory leak.
Valgrind reports:
==18725== 32 bytes in 1 blocks are definitely lost in loss record 339 of 497
==18725== at 0x4C29BBE: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==18725== by 0x450F1F: xmalloc (util.c:112)
==18725== by 0x41748E: replication_add_local_db (replication.c:137)
==18725== by 0x40803B: ovsdb_replication_init (ovsdb-server.c:146)
==18725== by 0x407C9E: ovsdb_server_connect_active_ovsdb_server (ovsdb-server.c:1165)
==18725== by 0x450AB3: process_command (unixctl.c:313)
==18725== by 0x4500DC: run_connection (unixctl.c:347)
==18725== by 0x44FFB6: unixctl_server_run (unixctl.c:400)
==18725== by 0x4081AC: main_loop (ovsdb-server.c:182)
==18725== by 0x406432: main (ovsdb-server.c:429)
Fixes: 60e0cd041958 ("ovsdb: Replication usability improvements") Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Flavio Fernandes <flavio@flaviof.com> Acked-by: Ben Pfaff <blp@ovn.org>
A bitmap in 'struct group_table' is used to track all the allocated
group_ids. For every run of logical flows action parsing, we
add 'group_info' structure to a hmap called 'desired_groups'. The
group_id assigned to this group_info either comes from an already
installed 'existing groups' or a new reservation done in the bitmap.
In ofctrl_put(), if there is a backlog, we call ovn_group_table_clear().
This could unreserve a group_id that comes from an already existing group.
This could result in re-use of group_id in the future causing errors while
installing new groups.
This commit fixes the above scenario.
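A minimal Python sketch of the scenario, not the ovn-controller code. A set stands in for the bitmap, and the `GroupTable` class with its `add_desired` and `clear_desired` methods is hypothetical. The point it illustrates is that clearing the desired groups should release only the ids that no installed group still owns.

```python
class GroupTable:
    """Hypothetical model: a set stands in for the group_id bitmap."""

    def __init__(self):
        self.allocated = set()   # reserved group_ids
        self.existing = {}       # installed groups: name -> group_id
        self.desired = {}        # groups wanted by this run: name -> group_id

    def add_desired(self, name):
        if name in self.existing:
            gid = self.existing[name]          # reuse the installed id
        else:
            gid = 1
            while gid in self.allocated:       # first free id
                gid += 1
            self.allocated.add(gid)
        self.desired[name] = gid
        return gid

    def clear_desired(self):
        # Release only ids that no installed group still owns.
        in_use = set(self.existing.values())
        for gid in self.desired.values():
            if gid not in in_use:
                self.allocated.discard(gid)
        self.desired.clear()


table = GroupTable()
table.existing["lb1"] = 1
table.allocated.add(1)
table.add_desired("lb1")      # takes id 1 from the existing group
table.clear_desired()         # backlog case: id 1 must stay reserved
new_id = table.add_desired("lb2")
```

Because id 1 stays reserved across the clear, the new group gets id 2 and does not collide with the installed one.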
Signed-off-by: Gurucharan Shetty <guru@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
When there are hundreds of nodes controlled by OVN, the workflow
to track and allocate unique tags across multiple hosts becomes
complicated. It is much easier to let ovn-northd do the allocation.
Signed-off-by: Gurucharan Shetty <guru@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
ovn-northd: Add load-balancers to gateway routers.
Load-balancers in gateway routers let us load-balance
north-south traffic.
This commit adds a new table called "DEFRAG" in the
logical router pipeline to defragment packets and to track them.
Once the packet is tracked, new connections get a group id as
an action. The group in turn chooses a DNAT action. Established
connections go through the DNAT table for a regular DNAT.
Signed-off-by: Gurucharan Shetty <guru@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
ovn-controller: Datapath based conntrack zone for load-balancing.
Currently ct_lb() logical action is only added for a logical switch and
we use the conntrack zone allocated for the logical port. A future commit
will use ct_lb() for a logical router too. In that case, use the allocated
DNAT zone.
Rationale for not passing zone as an argument for ct_lb():
One way to look at it would be that a "zone" is an internal implementation
detail and should not be seen in an action of a logical flow. But we can then
say that we could rename "zone" as "datapath" in the logical action. But,
then we would be limiting it to 2 anyway (datapath=lswitch or
datapath=lrouter) - in which case we are inferring it with the current patch.
Signed-off-by: Gurucharan Shetty <guru@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Joe Stringer [Fri, 2 Sep 2016 00:01:55 +0000 (17:01 -0700)]
datapath: Use pre-routing hook for conntrack.
The upstream code uses NF_INET_PRE_ROUTING hook for the nf_conntrack_in()
call, which does deeper (eg l4proto) validation. It was previously
thought that using the NF_INET_PRE_ROUTING hook for this function on older
kernels would trigger kernel panics due to a dependency on the
unpopulated skb->dev, however during recent testing on a variety of
platforms (Centos7.[12], Ubuntu 1[46].04, Fedora23) using the latest
distribution kernels and the OVS kernel module testsuite, no such kernel
panics were observed. Therefore it appears to be safe to bring this in
line with upstream without any other workarounds.
Ryan Moats [Fri, 9 Sep 2016 12:36:47 +0000 (07:36 -0500)]
ovn-nbctl, tests: Clean up noisy memory leaks
When run with valgrind, ovn-nbctl.c and tests/test-ovn.c reveal
memory leaks of their own. This patch cleans these up so that
they don't create noise when looking for leaks in the OVN daemon
processes.
Signed-off-by: Ryan Moats <rmoats@us.ibm.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
rhel: add option to run kernel datapath test when building rpms
Add ability to execute kernel datapath tests when building rpms.
These tests are disabled by default, and can optionally be run
by providing "--with check_datapath_kernel" on the rpmbuild command
line. This is intended to facilitate automated testing, and
should not be used in production environments (it is generally not
recommended to run rpmbuild as root).
ovn-controller: Fix match criteria for dynamic mac binding flows
The match struct is not initialized before adding flows for each entry in
mac_bindings table. The matches for IPv4 and IPv6 entries don't have
exactly the same form (IPv4 uses reg0, IPv6 uses xxreg0), so reusing
a match structure can cause problems.
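A minimal Python sketch of why reuse is a problem, with dictionaries standing in for the match structure. The function names and the binding layout are hypothetical; only the reg0/xxreg0 distinction comes from the text above.

```python
def build_flows_shared(bindings):
    """Buggy pattern: one match dict reused across all entries."""
    flows = []
    match = {}
    for b in bindings:
        reg = "reg0" if b["family"] == 4 else "xxreg0"
        match[reg] = b["ip"]          # stale keys from earlier entries remain
        flows.append(dict(match))
    return flows


def build_flows_fresh(bindings):
    """Fixed pattern: the match is initialized for every entry."""
    flows = []
    for b in bindings:
        match = {}
        reg = "reg0" if b["family"] == 4 else "xxreg0"
        match[reg] = b["ip"]
        flows.append(match)
    return flows


bindings = [
    {"family": 4, "ip": "10.0.0.7"},
    {"family": 6, "ip": "fd93::1"},
]
shared = build_flows_shared(bindings)
fresh = build_flows_fresh(bindings)
```

In the shared variant the IPv6 flow still carries the reg0 match left over from the IPv4 entry; in the fresh variant it matches on xxreg0 only.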
Signed-off-by: Chandra Sekhar Vejendla <csvejend@us.ibm.com> Signed-off-by: Ryan Moats <rmoats@us.ibm.com> Co-authored-by: Ryan Moats <rmoats@us.ibm.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
The following leaks are due to missing ds_destroy in a few
places in build_acl.
5,850 bytes in 50 blocks are definitely lost in loss record 93 of 93
at 0x4C29BFD: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
by 0x4C2BACB: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
by 0x449507: xrealloc (util.c:123)
by 0x42CC73: ds_reserve (dynamic-string.c:63)
by 0x42D08F: ds_put_format_valist (dynamic-string.c:161)
by 0x42D176: ds_put_format (dynamic-string.c:142)
by 0x40D380: build_acls (ovn-northd.c:2320)
by 0x40D380: build_lswitch_flows.constprop.36 (ovn-northd.c:2472)
by 0x4072D9: build_lflows (ovn-northd.c:3845)
by 0x4072D9: ovnnb_db_run (ovn-northd.c:3971)
by 0x4072D9: main (ovn-northd.c:4375)
9,360 bytes in 72 blocks are definitely lost in loss record 93 of 93
at 0x4C29BFD: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
by 0x4C2BACB: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
by 0x449507: xrealloc (util.c:123)
by 0x42CC73: ds_reserve (dynamic-string.c:63)
by 0x42D08F: ds_put_format_valist (dynamic-string.c:161)
by 0x42D176: ds_put_format (dynamic-string.c:142)
by 0x40D505: build_acls (ovn-northd.c:2346)
by 0x40D505: build_lswitch_flows.constprop.36 (ovn-northd.c:2472)
by 0x4072D9: build_lflows (ovn-northd.c:3845)
by 0x4072D9: ovnnb_db_run (ovn-northd.c:3971)
by 0x4072D9: main (ovn-northd.c:4375)
Signed-off-by: Ramu Ramamurthy <ramu.ramamurthy@us.ibm.com> Acked-by: Ryan Moats <rmoats@us.ibm.com> Signed-off-by: Russell Bryant <russell@ovn.org>
Joe Stringer [Wed, 7 Sep 2016 21:07:41 +0000 (14:07 -0700)]
system-traffic: Add FTP NAT test without seqadj.
The existing FTP with NAT tests all perform NATing from an IP like
10.1.1.1 -> 10.1.1.240, which requires adjusting the length of FTP
control messages as they pass through the connection tracker.
Occasionally this is a source of kernel bugs, so it is useful to have a
regular FTP NAT test between IPs that do not change the message length
in FTP control messages (eg, 10.1.1.1 -> 10.1.1.9) to more clearly
identify failures in this area.
Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
xlate: Clarify comment about mac learning table entry locking.
The rationale for locking mac learning table entries wrt. gratuitous
ARP packets and bond interfaces was too cryptic for me to understand.
After reading vswitchd/INTERNALS the issue is understandable, but we
can still improve the comment to prevent such confusion in future.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Andy Zhou [Tue, 23 Aug 2016 11:05:11 +0000 (04:05 -0700)]
ovsdb: Replication usability improvements
Based on feedback from initial HA manager integration, added the
'--active' command line option and the appctl command
"ovsdb-server/sync-status". See man page updates for details.
Added the RPL_S_INIT state in the state machine. This state is
not strictly necessary for the replication state machine, but is
introduced to make sure the state is updated immediately when
the state machine is reset, via replication_init(). Without it
ovsdb-server/sync-status may display "replicating" or crash, if the command
is issued after replication_init() is called, but before
the state variable is updated from replication_run().
Added a test to simulate the integration of HA manager with OVSDB
server using replication.
Other documentation and API improvements.
Tested-by: Numan Siddique <nusiddiq@redhat.com> Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Fri, 2 Sep 2016 23:07:42 +0000 (16:07 -0700)]
tests: Fix recently broken sFlow tests.
A recent improvement to the promptness of sFlow reporting caused some of
the sFlow tests to fail (because the output was reported sooner). This
fixes up sequence numbers in the expected output to match the new behavior.
It also reduces the amount of (virtual) time that the test waits since it's
no longer necessary to wait as long.
ofproto: Honor mtu_request even for internal ports.
By default Open vSwitch tries to configure internal interfaces MTU to
match the bridge minimum, overriding any attempt by the user to
configure it through standard system tools, or the database.
While this works in many simple cases (there are probably many users
that rely on this) it may create problems for more advanced use cases
(like any overlay networks).
This commit allows the user to override the default behavior by
providing an explicit MTU in the mtu_request column in the Interface
table.
This means that Open vSwitch will now treat database MTU requests
differently from MTU requests made with standard system tools (`ip link`
or `ifconfig`), but this seems the best way to remain compatible with
old users while providing a more powerful interface.
Suggested-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ben Pfaff <blp@ovn.org> Tested-by: Joe Stringer <joe@ovn.org>
Revert "ofproto: Always set MTU for new internal ports."
This reverts commit 47bf118665a3d0f3c153d1fe80e9af02ac9a4e9c.
While the commit tries to make it more consistent, it breaks some system
tests. The assumptions made on the tests are probably made by many
users, so it's better to revert it.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Fri, 2 Sep 2016 20:26:50 +0000 (13:26 -0700)]
learn: Fix iteration over learning specs.
struct ofpact_learn_spec is variable-length. The 'n_specs' member of
struct ofpact_learn counted the number of specs, but the iteration loops
over struct ofpact_learn_spec only iterated as far as the *minimum* length
of 'n_specs' specs.
This fixes the problem, which exhibited as consistent failures for test 431
(learning action - TCPv6 port learning), seemingly only on i386 since it
shows up for my personal development machine but appears to not happen for
anyone else.
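A minimal Python sketch of the same class of bug on a made-up encoding, not the real struct ofpact_learn_spec layout. Each hypothetical spec is a one-byte payload length followed by the payload, so the minimum spec size is one byte. Bounding the loop by `n_specs` times the minimum size stops early; bounding it by the real end of the data visits every spec.

```python
import struct

HEADER = struct.Struct("!B")      # hypothetical spec header: payload length
MIN_SPEC_SIZE = HEADER.size       # a spec with no immediate data


def encode(payloads):
    return b"".join(HEADER.pack(len(p)) + p for p in payloads)


def iter_specs(buf, end):
    """Yield payloads of variable-length specs located before offset 'end'."""
    offset = 0
    while offset < end:
        (length,) = HEADER.unpack_from(buf, offset)
        start = offset + HEADER.size
        yield buf[start:start + length]
        offset = start + length   # advance by the real size of this spec


payloads = [b"\x00\x50", b"", b"\x01\xbb\x00\x00"]
buf = encode(payloads)
n_specs = len(payloads)

# Buggy bound: pretends every spec has the minimum size.
short = list(iter_specs(buf, n_specs * MIN_SPEC_SIZE))
# Correct bound: the actual end of the encoded specs.
full = list(iter_specs(buf, len(buf)))
```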
Fixes: dfe191d5faa6 ("ofp-actions: Waste less memory in learn actions.") Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
This patch changes the order of the steps that are followed
every second in the sFlow agent. By moving the receiver_tick()
step to the end, we ensure that any counters that were polled
during the poller_tick() step are flushed immediately to the
sFlow collector. This eliminates what was a variable time-delay
between counters being polled and being flushed.
The variable time-delay that this eliminates could be up to
a second because counters lingering in the output buffer could be
flushed at any time by the arrival of random packet-samples.
Since the sFlow standard does not require that a poll-timestamp be sent
along with the counters, the collector must use its receive-time as the
timestamp, so that extra second of variable delay was "stretching or
shrinking" the time between successive counter readings. This
affected any counter-rate calculation that was based only on the delta
between successive samples. The effect was small with a polling
interval of 60 seconds: just +/- 2%. But the effect grew larger
when faster polling was configured. For example, if the counters
were pushed every 5 seconds then the instantaneous rate
calculations could wander by +/- 20%. For a thorough analysis
of this problem, see Rick Jones' paper:
"High Frequency sFlow v5 Counter Sampling"
ftp://ftp.netperf.org/papers/high_freq_sflow/hf_sflow_counters.pdf
So this patch makes it possible to obtain usable results even
when high-frequency polling is configured.
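The percentages quoted above follow from dividing the delay by the polling interval. A small Python sketch of that arithmetic, assuming the simple first-order approximation that up to one second of timing error on an interval of T seconds gives a relative rate error of about 1/T (the `rate_error` helper is hypothetical):

```python
def rate_error(interval_s, max_jitter_s=1.0):
    """Approximate worst-case relative error of a rate computed over
    'interval_s' seconds when the interval may be off by 'max_jitter_s'."""
    return max_jitter_s / interval_s


err_60 = rate_error(60)   # roughly 2% for 60-second polling
err_5 = rate_error(5)     # 20% for 5-second polling
```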
Signed-off-by: Neil McKee <neil.mckee@inmon.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
ovsdb-idlc: Fix logic error in IDL parse function.
This was found due to a build error when adding an ovsschema column
with
"type": {"key": "string", "value": "integer"}
with no min or max, only a single instance.
I am rather unfamiliar with IDL, so no tests have been added yet.
I could use some pointers, or someone familiar with IDL tests could
take over.
Signed-off-by: Mickey Spiegel <mickeys.dev@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
learn: Avoid nested zero-sized arrays to fix build with MSVC.
Avoid using nested zero-sized arrays to allow compilation with MSVC.
Also, make sure the immediate data is accessed only if it exists, and
that the size is always calculated from struct learn_spec field
'n_bits'.
Fixes: dfe191d5faa6 ("ofp-actions: Waste less memory in learn actions.") Reported-by: Alin Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Joe Stringer [Wed, 31 Aug 2016 18:06:05 +0000 (11:06 -0700)]
upcall: Replace ukeys for deleted flows.
If a revalidator dumps/revalidates a flow during the 'dump' phase,
resulting in the deletion of the flow, then the ukey state moves into
UKEY_EVICTED, and the ukey is kept around until the 'sweep' phase. The
ukey is kept around to ensure that cases like duplicated dumps from the
datapaths do not result in multiple attribution of the same stats.
However, if an upcall for this flow comes for a handler between the
revalidator 'dump' and 'sweep' phases, the handler will look up the ukey
and find that the ukey exists, then skip installing a new flow entirely.
As a result, for this period all traffic for the flow is slowpathed.
If there is a lot of traffic hitting this flow, then it will all be
handled in userspace until the 'sweep' phase. Eventually the
revalidators will reach the sweep phase and delete the ukey, and
subsequently the handlers should install a new flow.
To reduce the slowpathing of this traffic during flow table transitions,
allow the handler to identify this case during miss upcall handling and
replace the existing ukey with a new ukey. The handler will then be able
to install a flow for this traffic, allowing the traffic flow to return
to the fastpath.
Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
Joe Stringer [Wed, 31 Aug 2016 18:06:04 +0000 (11:06 -0700)]
upcall: Track ukey states.
Ukeys have a defined lifetime that starts from being created, inserted
into the umaps, having the corresponding flow installed, then the flow
deleted, the ukey removed from the umap, rcu-deferral of its deletion,
and finally freedom.
However, until now it's all been represented behind a simple boolean
"flow_exists" with a bunch of implicit logic sprinkled around the
accessors. This patch attempts to make the ukey lifetime a bit clearer
by outlining the correct transitions and asserting that their lifetime
proceeds as expected.
This should improve the readability of the current code, and also make
the following patch easier to reason about.
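A minimal Python sketch of an explicit, forward-only lifetime with a checked transition. The state names below are hypothetical and simply follow the lifetime listed above; they are not claimed to be the names used in the patch.

```python
from enum import IntEnum


class UkeyState(IntEnum):
    # Hypothetical names following the lifetime listed above.
    CREATED = 0      # allocated, not yet in a umap
    INSERTED = 1     # visible in the umap
    INSTALLED = 2    # datapath flow installed
    EVICTED = 3      # datapath flow deleted
    REMOVED = 4      # taken out of the umap, deletion deferred via RCU


class Ukey:
    def __init__(self):
        self.state = UkeyState.CREATED

    def transition(self, new):
        # The lifetime only moves forward; anything else is a bug.
        if new <= self.state:
            raise ValueError(f"{self.state.name} -> {new.name}")
        self.state = new


ukey = Ukey()
for step in (UkeyState.INSERTED, UkeyState.INSTALLED, UkeyState.EVICTED):
    ukey.transition(step)

try:
    ukey.transition(UkeyState.INSTALLED)   # going backwards must fail
    went_back = True
except ValueError:
    went_back = False
```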
Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
Joe Stringer [Wed, 31 Aug 2016 18:06:03 +0000 (11:06 -0700)]
upcall: Only init flow_put if ukey is installed.
Currently when processing a batch of upcalls, all datapath operations
are first initialized, then later the corresponding ukeys are installed.
If the ukey_install fails at this later point, then the code needs to
backtrack a bit to delete the ukey and skip using the initialized
datapath op.
It's a little simpler to only initialize the datapath operation if the
ukey could actually be installed. The locks are held longer, but these
locks aren't heavily contended and the extended holding of the lock will
be removed in a subsequent patch anyway.
Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
Ben Pfaff [Thu, 1 Sep 2016 17:02:53 +0000 (10:02 -0700)]
ovn-controller: Fix memory leak in recv_S_TLV_TABLE_REQUESTED().
Nothing freed 'reply'. This fixes the problem.
Most of this patch is moving code around. The essential change is that
breaking the code that works with 'reply' out into a separate function
makes it possible to catch all paths out of the function so that 'reply'
can be freed in one place.
Reported-by: Ryan Moats <rmoats@us.ibm.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Ryan Moats <rmoats@us.ibm.com> Acked-by: Flavio Fernandes <flavio@flaviof.com>
Ben Pfaff [Wed, 31 Aug 2016 21:25:41 +0000 (14:25 -0700)]
ovn-controller: Drop incremental processing from encapsulation code.
This commit reverts encaps.c to its content just before commit 1d45d5a9666d
(ovn-controller: Change encaps_run to work incrementally.). I then
reintroduced the UDP checksum support originally added in commit
36283d7884f3 (ovn-controller: Use UDP checksums when creating Geneve
tunnels.) I also read the other commits following the incremental
processing commit to verify that this change didn't lose any bug fixes.
This commit takes advantage of the "addvalue" and "delvalue" functions
now available in the IDL to simplify some code.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Russell Bryant <russell@ovn.org>
Ben Pfaff [Wed, 31 Aug 2016 21:25:40 +0000 (14:25 -0700)]
ovsdb-idlc: Make set and map update operations take const arguments.
In a call like "ovsrec_bridge_update_ports_delvalue(bridge, port)", there's
no reason for the port argument to be nonconst, because the call doesn't
do anything to the port at all--it only searches the list of ports in the
bridge for that particular port and, if it finds it, removes it.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Russell Bryant <russell@ovn.org>
Andy Zhou [Tue, 23 Aug 2016 20:57:37 +0000 (13:57 -0700)]
ovsdb: Reimplement replication using a state machine.
Current replication uses blocking transactions, which are error prone
in practice, especially in handling RPC connection flapping to the
active server.
Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Andy Zhou [Wed, 17 Aug 2016 20:56:02 +0000 (13:56 -0700)]
ovsdb: Add request_ids
When starting, the replication logic may issue multiple requests at
a time, for example, one monitor request for each database. The
request_ids keeps track of all outstanding request IDs, which are used
to match reply messages. It also provides the 'db' context
for the reply.
Future patches will make use of this facility.
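A minimal Python sketch of the bookkeeping, assuming a plain mapping from request id to database name. The `RequestIds` class and its method names are hypothetical and do not mirror the C API.

```python
class RequestIds:
    """Hypothetical tracker of outstanding JSON-RPC request ids."""

    def __init__(self):
        self._pending = {}

    def add(self, request_id, db):
        self._pending[request_id] = db

    def lookup_and_remove(self, request_id):
        # Returns the db context for a reply, or None for an unknown id.
        return self._pending.pop(request_id, None)

    def __len__(self):
        return len(self._pending)


ids = RequestIds()
ids.add(1, "OVN_Northbound")
ids.add(2, "OVN_Southbound")
db = ids.lookup_and_remove(2)
unknown = ids.lookup_and_remove(99)
```

A reply whose id is not outstanding yields no context, so it can be ignored or reported instead of being applied to the wrong database.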
Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Andy Zhou [Tue, 16 Aug 2016 21:56:19 +0000 (14:56 -0700)]
ovsdb: Add blacklist_tables
Currently, 'sync-exclude-tables' command line options are simply stored
in a string. Change the implementation to store it in an shash instead
to improve modularity.
One additional benefit of this change is that errors can be detected
and reported to the user earlier. Add a 'dryrun' option to
set_blacklist_tables() API to make this feature available to the
command line option parsing and unixctl command parsing.
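A minimal Python sketch of a validate-only mode. The "db:table" syntax, the `known` schema argument and the error strings are assumptions made for illustration; this function only borrows the name of the C API and does not reproduce its signature.

```python
def set_blacklist_tables(spec, known, current, dryrun=False):
    """Parse 'db:table,db:table' and return an error string or None.

    'known' maps db name -> set of table names. With dryrun=True the
    spec is only validated and 'current' is left untouched.
    """
    parsed = {}
    for item in filter(None, spec.split(",")):
        db, sep, table = item.partition(":")
        if not sep or not table:
            return f"bad syntax: {item!r}"
        if table not in known.get(db, ()):
            return f"unknown table: {item!r}"
        parsed.setdefault(db, set()).add(table)
    if not dryrun:
        current.clear()
        current.update(parsed)
    return None


known = {"OVN_Southbound": {"Chassis", "Encap"}}
current = {}
err_bad = set_blacklist_tables("OVN_Southbound", known, current, dryrun=True)
err_dry = set_blacklist_tables("OVN_Southbound:Chassis", known, current,
                               dryrun=True)
state_after_dry = dict(current)
err_set = set_blacklist_tables("OVN_Southbound:Chassis", known, current)
```

The dry run lets option parsing reject a bad value up front, while the stored set only changes on the real call.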
Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Wed, 31 Aug 2016 19:43:45 +0000 (12:43 -0700)]
ovn-controller: Unpersist lflow data structures for address sets.
With the removal of incremental processing, it is no longer
necessary to persist the data structures for storing address
sets. Simplify things by removing this complexity.
Signed-off-by: Ryan Moats <rmoats@us.ibm.com>
[blp@ovn.org deleted more unnecessary code] Signed-off-by: Ben Pfaff <blp@ovn.org>
Paul Boca [Tue, 30 Aug 2016 12:00:58 +0000 (12:00 +0000)]
python tests: Skip python tests that kill the python daemon
If the python script is killed with the `kill` command, the atexit
handler doesn't get executed on Windows.
The kill of the process is done using NtTerminateProcess which
doesn't send a signal to the process itself, it just terminates the
process from kernel mode.
Nithin Raju [Wed, 31 Aug 2016 10:33:01 +0000 (03:33 -0700)]
datapath-windows: remove invalid ASSERT in Flow.c
Since the Geneve changes, the key->l2.offset will no longer be 0 when
the tunnel key is valid within the OVS flow key. key->l2.offset would
be determined by the amount of tunnel options.
Ryan Moats [Wed, 31 Aug 2016 15:22:43 +0000 (15:22 +0000)]
ovn-controller: Back out incremental processing
As [1] indicates, incremental processing hasn't resulted
in an improvement worth the complexity it has added.
This patch backs out all of the code specific to incremental
processing, along with the persisting of OF flows,
logical ports, multicast groups, all_lports, local and patched
datapaths.
Persisted objects in the ovn/controller/physical.c module will
be used by a future patch set to determine if physical changes
have occurred.
Future patch sets in the series will convert
the ovn/controller/encaps.c module back to full processing
and remove the persistence of address sets in the
ovn/controller/lflow.c module.
Jarno Rajahalme [Wed, 31 Aug 2016 15:43:48 +0000 (08:43 -0700)]
ofp-actions: Waste less memory in set field and load actions.
Change the value and mask to be added to the end of the set field
action without any extra bytes, except for the usual ofp-actions
padding to 8 bytes. Together with some structure member packing, this
saves on average about 256 bytes for each set field and load action
(as set field internal representation is also used for load actions).
On a specific production data set each flow entry uses on average
about 4.2 load or set field actions. This means that with this patch
an average of more than 1kb can be saved for each flow with such a
flow table.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
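The padding rule and the per-flow savings quoted above can be checked with a short sketch. The 8-byte rounding follows the ofp-actions padding mentioned in the message; the helper names, the header size, and the field widths are illustrative assumptions, not values taken from the patch.

```python
def round_up(n, multiple=8):
    """Round n up to the next multiple (ofp-actions are padded to 8 bytes)."""
    return (n + multiple - 1) // multiple * multiple


def set_field_size(header_bytes, field_bytes):
    """Size of a set field action with value and mask appended at the end.

    header_bytes is a hypothetical fixed header size. The value and the
    mask each take field_bytes, and the total is padded to 8 bytes.
    """
    return round_up(header_bytes + 2 * field_bytes)


# Figures quoted in the commit message.
SAVED_PER_ACTION = 256   # bytes saved per set field or load action
ACTIONS_PER_FLOW = 4.2   # average on the production data set

saved_per_flow = SAVED_PER_ACTION * ACTIONS_PER_FLOW
print(saved_per_flow > 1024)  # more than 1kb saved per flow
```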
Jarno Rajahalme [Wed, 31 Aug 2016 15:43:48 +0000 (08:43 -0700)]
ofp-actions: Waste less memory in learn actions.
Make the immediate data member 'src_imm' of a learn spec allocated at
the end of the action for just the right size. This, together with
some structure packing, saves an average of ~128 bytes for each learn
spec in each learn action. Typical learn actions have about 4 specs
each, so this amounts to saving about 0.5kb for each learn action.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
We only change the MTU of new internal ports if it is bigger than the
bridge minimum. But when the minimum MTU of the bridge is updated we
change the MTU of all internal ports no matter what.
The behavior is inconsistent, because now the internal ports' MTU depends
on the order in which the ports were added.
This commit fixes the problem by _always_ setting the MTU of new
internal ports to the bridge minimum. I'm not sure what the logic was
behind only adjusting the MTU if it was too big.
A testcase is improved to detect the problem.
VMware-BZ: #1718776 Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ben Pfaff <blp@ovn.org>
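The order dependence described above can be reproduced with a simplified model. This is only a sketch of the two rules as the message states them, not the vswitchd code; the function name, the event tuples, and the port names are hypothetical.

```python
def apply_events(events, always_set_new_ports):
    """Replay port additions and return the resulting internal port MTUs.

    events: list of (kind, name, mtu) with kind "physical" or "internal".
    always_set_new_ports=False models the old rule (only lower a new
    internal port's MTU if it exceeds the bridge minimum); True models
    the fixed rule (always use the bridge minimum).
    """
    bridge_min = None  # minimum MTU over the non-internal ports seen so far
    internal = {}      # internal port name -> current MTU
    for kind, name, mtu in events:
        if kind == "physical":
            new_min = mtu if bridge_min is None else min(bridge_min, mtu)
            if new_min != bridge_min:
                bridge_min = new_min
                # Old and new rule alike: a changed minimum is pushed to
                # every existing internal port.
                for port in internal:
                    internal[port] = bridge_min
        else:
            if bridge_min is not None and (always_set_new_ports or mtu > bridge_min):
                mtu = bridge_min
            internal[name] = mtu
    return internal


# Same three ports, added in two different orders.
order_a = [("physical", "p1", 1500), ("internal", "int0", 1400), ("physical", "p2", 1450)]
order_b = [("physical", "p1", 1500), ("physical", "p2", 1450), ("internal", "int0", 1400)]

old_a = apply_events(order_a, always_set_new_ports=False)
old_b = apply_events(order_b, always_set_new_ports=False)
new_a = apply_events(order_a, always_set_new_ports=True)
new_b = apply_events(order_b, always_set_new_ports=True)
```

Under the old rule the two orders leave int0 with different MTUs; under the fixed rule both orders converge on the bridge minimum.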
Jesse Gross [Tue, 30 Aug 2016 21:04:12 +0000 (14:04 -0700)]
ovs-save: Restore tunnel TLV map before flows.
Scripts that integrate OVS with a distribution often save and
restore flows across disruptive events, such as an upgrade. The
ovs-save utility generates a script to assist with this.
When flows include tunnel metadata, we also need to restore the
TLV mappings before the flows are re-added. Otherwise, the instance
of OVS receiving the new flows won't know the meaning of these
fields and will ignore them.
Jesse Gross [Mon, 29 Aug 2016 17:54:19 +0000 (10:54 -0700)]
ovs-ofctl: Extract tunnel metadata correctly when sorting flows.
When flow fields are sorted before dumping in ovs-ofctl, each
significant field is extracted for sorting. However, in the case of
tunnel metadata a mapping table is necessary to know where each
field begins and ends. This information is currently stripped off before
fetching the field data and the returned field is simply zeroed. This
makes sorting based on tunnel metadata non-deterministic.
We have the tunnel allocation stored in match metadata with each
flow, so we can simply extract the data from there rather than
trying to build and populate a global mapping table.
Signed-off-by: Jesse Gross <jesse@kernel.org> Acked-by: Ben Pfaff <blp@ovn.org>
Jesse Gross [Sun, 28 Aug 2016 23:17:22 +0000 (16:17 -0700)]
ovs-ofctl: Fix crash with replace-flows and diff-flows with tunnel metadata.
When flows are read by ovs-ofctl (either from a switch or a file),
tunnel metadata space is dynamically allocated since there isn't a
preset table. This works well for single flows but doesn't handle
groups of flows that must be compared to each other. In this case,
each flow will have its own independent allocation making comparisons
meaningless.
Even worse is that when these matches are later serialized (either
for display or in NXM format), the metadata allocation has been
stripped off of the matches. The serialization code then attempts to
use the global table, which is also not available, leading to a
dereference of a NULL pointer.
Solving this problem requires building an overall metadata table.
Since we don't know the maximum size of a field (particularly for
flows read from a file), it's necessary to do this in two passes.
The first pass records the maximum size for each field as well as
stores the received matches. The second pass creates a metadata
table based on the sizes, adjusts the match layout based on the new
allocation, and then replays the stored matches for comparison.
Later serialization will use the generated table to output the
flows.
Signed-off-by: Jesse Gross <jesse@kernel.org> Acked-by: Ben Pfaff <blp@ovn.org>
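The two-pass approach described above can be sketched as follows. This is an illustration of the idea only, not the ovs-ofctl implementation: matches are modelled as plain dicts of field name to bytes, the function name is hypothetical, and right-aligning each value in its slot is an assumption made for the example.

```python
def build_metadata_table(matches):
    """Build one shared layout for all matches, then replay them against it.

    matches: list of dicts mapping a metadata field name to a bytes value.
    Returns (table, replayed), where table maps field -> (offset, size) and
    replayed holds each match laid out in the shared allocation.
    """
    # Pass 1: store the received matches and record the maximum size
    # seen for each field.
    max_size = {}
    stored = []
    for match in matches:
        stored.append(match)
        for field, value in match.items():
            max_size[field] = max(max_size.get(field, 0), len(value))

    # Pass 2: create the table from the recorded sizes.
    table = {}
    offset = 0
    for field in sorted(max_size):
        table[field] = (offset, max_size[field])
        offset += max_size[field]

    # Adjust each stored match to the new allocation so that all matches
    # share one layout and can be compared byte for byte.
    replayed = []
    for match in stored:
        buf = bytearray(offset)
        for field, value in match.items():
            start, size = table[field]
            # Assumption: shorter values are right-aligned in the slot.
            buf[start + size - len(value):start + size] = value
        replayed.append(bytes(buf))
    return table, replayed


table, replayed = build_metadata_table([
    {"tun_metadata0": b"\x01"},
    {"tun_metadata0": b"\x00\x01", "tun_metadata1": b"\xff"},
])
```

With independent per-flow allocations the two tun_metadata0 values would occupy differently sized slots; with the shared table both occupy the same two bytes and compare equal.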
Ansis Atteka [Thu, 4 Aug 2016 10:58:32 +0000 (03:58 -0700)]
ipsec: Do not allow ipsec_gre tunnel traffic to exit unencrypted
If ipsec_gre tunnel configuration is changed in OVSDB,
then GRE packets may sometimes exit unencrypted until
per-tunnel IPsec policies are installed by ovs-monitor-ipsec
daemon.
This patch fixes this issue by installing a single, low
priority IPsec block policy that drops all GRE packets
coming out from ipsec_gre tunnels that do not yet have
their own IPsec policies installed.
This patch depends on two other recently committed
patches:
1. 574ff4aa (tunneling: get skb marking to work
properly with tunnels)
2. ca3574d5 (IPsec: refactor out some code in
OVS_MONITOR_IPSEC_START macro)
Amitabha Biswas [Wed, 24 Aug 2016 05:12:30 +0000 (22:12 -0700)]
ovsdb: Fix mutation of newly inserted rows from Python IDL.
This patch fixes the scenario where the mutate operation on a row
is sent in the same transaction as the row insert operation. It was
observed that this mutate operation was not getting committed
to the OVSDB.
To get around the above problem, the "where" condition in a
mutate operation is modified to use the named-uuid to identify
a row created in the current transaction.
Signed-off-by: Amitabha Biswas <abiswas@us.ibm.com> Suggested-by: Richard Theis <rtheis@us.ibm.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
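As a rough illustration of the request shape this implies, the sketch below builds an OVSDB "transact" message (per RFC 7047) in which the mutate operation selects the freshly inserted row through its named-uuid. It is not the Python IDL code; the helper name, database, table, and column names are hypothetical.

```python
import json


def insert_then_mutate(db, table, row, column, mutator, value, uuid_name="new_row"):
    """Build a transact request that inserts a row and mutates it at once."""
    insert_op = {
        "op": "insert",
        "table": table,
        "row": row,
        "uuid-name": uuid_name,  # symbolic name for the row being inserted
    }
    mutate_op = {
        "op": "mutate",
        "table": table,
        # The row has no real UUID yet, so refer to it by its named-uuid.
        "where": [["_uuid", "==", ["named-uuid", uuid_name]]],
        "mutations": [[column, mutator, value]],
    }
    return {"method": "transact", "params": [db, insert_op, mutate_op], "id": 0}


request = insert_then_mutate(
    "Example_DB", "Example_Table", {"name": "row0"}, "counter", "+=", 1)
print(json.dumps(request, indent=2))
```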
Jarno Rajahalme [Tue, 30 Aug 2016 17:20:51 +0000 (10:20 -0700)]
lib: Retire packet buffering feature.
OVS implementation of buffering packets that are sent to the
controller is not compliant with the OpenFlow specifications after
OpenFlow 1.0, which is possibly true since OpenFlow 1.0 is not really
specifying the packet buffering behavior.
OVS implementation executes the buffered packet against the actions of
the modified or added rule, whereas OpenFlow (since 1.1) specifies
that the packet should be matched against the flow table 0 and
processed accordingly.
Rather than fix this behavior, and potentially break OVS users, the
packet buffering feature is removed altogether. After all, such
packet buffering is an optional OpenFlow feature, and as such any
possible users should continue to work without this feature.
This patch also makes OVS check the received 'buffer_id' values more
rigorously, and fixes some internal users accordingly.
Found by inspection.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Paul Boca [Tue, 30 Aug 2016 12:00:57 +0000 (12:00 +0000)]
python tests: Skip python tests specific to Linux.
There is a difference between the POSIX pid and the Windows pid; they
are not always equal. On Windows, when a python script is started, a sh
command is triggered as the parent of the script. So when we try to
get the daemon pid with 'echo $!', this will get the pid of sh,
not of python.exe as expected. Some tests also use switches for the
`kill` command that are undefined on Windows.
Jarno Rajahalme [Mon, 29 Aug 2016 18:38:40 +0000 (11:38 -0700)]
vswitchd: Deprecate packet buffering in OVS 2.6.
OVS implementation of buffering packets that are sent to the
controller is not compliant with the OpenFlow specifications after
OpenFlow 1.0. OVS implementation executes the buffered packet against
the actions of the modified or added rule, whereas OpenFlow (since
1.1) specifies that the packet should be matched against the flow
table 0 and processed accordingly.
Rather than fix this behavior, and potentially break OVS users, we
propose to remove the feature altogether, starting in OVS 2.7. This
patch announces this in 'NEWS' for OVS 2.6, and adds detail to the FAQ
question about packet buffering.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Wed, 17 Aug 2016 20:58:13 +0000 (13:58 -0700)]
ovn-northd: Avoid excessive work to find router ports.
The ovn_datapath for each logical switch maintains an array of its ports
of type "router-port", but instead of iterating through it build_pre_acls()
iterated through all of the ports in the entire database, which is
wasteful and duplicative work. This commit switches to using the array of
router ports.
This change is best viewed ignoring white space only changes.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Ryan Moats <rmoats@us.ibm.com>
Ryan Moats [Wed, 24 Aug 2016 16:56:32 +0000 (16:56 +0000)]
ovn-controller: Fix memory leak when parsing lflow actions.
Parsing logical flow actions with ovnacts_parse* that include
string constants currently leaks memory. Add calls to ovnacts_free
to recapture said memory.
Signed-off-by: Ryan Moats <rmoats@us.ibm.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
ovn: Delete stale MAC_Bindings that result in Referential Integrity Violation
The MAC_Bindings have a strong reference to the Datapath_Binding. However the
MAC_Bindings are never deleted anywhere, and when the Datapath (associated
with a MAC_Binding) is deleted, the ovsdb-server returns Referential
Integrity Violation. This prevents newer operations initiated from the CMS
from being committed to the Southbound DB.
The patch fixes this by deleting the MAC_Binding entry when the
logical_port referred to in the mac_binding entry is deleted.
Signed-off-by: Chandra Sekhar Vejendla <csvejend@us.ibm.com> Signed-off-by: Ben Pfaff <blp@ovn.org>