Han Zhou [Tue, 2 May 2017 20:22:35 +0000 (13:22 -0700)]
ovn-controller: Disable probes by default for unix sockets.
Normally the OVS JSON-RPC library does not probe idle connections across
Unix domain sockets, since the kernel can tell OVS whether the connections
are truly connected without probes, but ovn-controller carelessly
overrode that.
(This should not be an issue in typical OVN deployments, because the OVN SB
database is normally accessed via TCP or SSL.)
CC: Nirapada Ghosh <nghosh@us.ibm.com> Fixes: 715038b6b222 ("ovn-controller: reload configured SB probe timer") Signed-off-by: Han Zhou <zhouhan@gmail.com> Co-authored-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org>
Yi-Hung Wei [Mon, 1 May 2017 17:24:35 +0000 (10:24 -0700)]
system-traffic: Add test for mpls actions
Add ping test to verify the behavior of mpls_push/pop actions. In this
test, we use the resubmit action to trigger recirculation to make sure
the flow key is revalidated after mpls_push/pop. This test depends on
commit 5ba0c107c51e ("datapath: Fix ovs_flow_key_update()") to behave
correctly.
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Acked-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
ovs_flow_key_update() is called when the flow key is invalid, and it is
used to update and revalidate the flow key. Commit 329f45bc4f19
("openvswitch: add mac_proto field to the flow key") introduces the mac_proto
field to the flow key and uses it to determine whether the flow key is valid.
However, the commit does not update the code path in ovs_flow_key_update()
to revalidate the flow key, which may trigger the BUG_ON() in execute_recirc().
This patch addresses the aforementioned issue.
Fixes: 329f45bc4f19 ("openvswitch: add mac_proto field to the flow key") Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Acked-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Acked-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
openvswitch: correctly fragment packet with mpls headers
If mpls headers were pushed to a defragmented packet, the refragmentation no
longer works correctly after 48d2ab609b6b ("net: mpls: Fixups for GSO"). The
network header has to be shifted after the mpls headers for the
fragmentation and restored afterwards.
Fixes: 48d2ab609b6b ("net: mpls: Fixups for GSO") Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
openvswitch: mpls: set network header correctly on key extract
After commit 48d2ab609b6b ("net: mpls: Fixups for GSO"), MPLS handling in
openvswitch was changed to have network header pointing to the start of the
MPLS headers and inner_network_header pointing after the MPLS headers.
However, key_extract was missed by the mentioned commit, causing incorrect
headers to be set when an MPLS packet just enters the bridge or after it is
recirculated.
Fixes: 48d2ab609b6b ("net: mpls: Fixups for GSO") Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
Yi-Hung Wei [Mon, 1 May 2017 17:24:31 +0000 (10:24 -0700)]
datapath: Fixups for MPLS GSO
This patch backports the following two upstream commits to fix MPLS GSO in
ovs datapath. Starting from upstream commit 48d2ab609b6b ("net: mpls: Fixups
for GSO"), the mpls_gso kernel module relies on the fact that
skb_network_header() points to the mpls header and skb_inner_network_header()
points to the L3 header so that it can derive the length of mpls header
correctly, and the upstream commit updates how the ovs datapath marks the skb
headers when pushing and popping mpls. However, the old mpls_gso kernel module
assumes that the skb_network_header() points to the L3 header, and the old
mpls_gso kernel module will misbehave if the ovs datapath marks the
skb_network_header() in the new way since it will treat mpls header as the L3
header.
Because the function signature of mpls_gso_segment() does not change,
this backport patch uses the new mpls_hdr() to determine if the kernel that
ovs datapath is compiled with has the new or legacy mpls_gso kernel module.
It has been tested on kernel 4.4 and 4.9.
As reported by Lennert the MPLS GSO code is failing to properly segment
large packets. There are a couple of problems:
1. the inner protocol is not set so the gso segment functions for inner
protocol layers are not getting run, and
2. MPLS labels for packets that use the "native" (non-OVS) MPLS code
are not properly accounted for in mpls_gso_segment.
The MPLS GSO code was added for OVS. It is re-using skb_mac_gso_segment
to call the gso segment functions for the higher layer protocols. That
means skb_mac_gso_segment is called twice -- once with the network
protocol set to MPLS and again with the network protocol set to the
inner protocol.
This patch sets the inner skb protocol addressing item 1 above and sets
the network_header and inner_network_header to mark where the MPLS labels
start and end. The MPLS code in OVS is also updated to set the two
network markers.
From there the MPLS GSO code uses the difference between the network
header and the inner network header to know the size of the MPLS header
that was pushed. It then pulls the MPLS header, resets the mac_len and
protocol for the inner protocol and then calls skb_mac_gso_segment
to segment the skb.
After the inner protocol segmentation is done, the skb protocol
is set to mpls for each segment and the network and mac headers
are restored.
Reported-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream commit:
commit 85de4a2101acb85c3b1dde465e84596ccca99f2c
Author: Jiri Benc <jbenc@redhat.com>
Date: Fri Sep 30 19:08:07 2016 +0200
openvswitch: use mpls_hdr
skb_mpls_header is equivalent to mpls_hdr now. Use the existing helper
instead.
Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
Ben Pfaff [Sun, 30 Apr 2017 21:03:02 +0000 (14:03 -0700)]
ovn-nbctl: Display and accept Neutron network, router, port names.
The names of these neutron:* keys in external_ids are unfortunate, but
they are the keys that the OVN utilities need to support if we want users
to be able to work with OpenStack in a convenient fashion rather than
having to cut and paste UUIDs everywhere.
This commit documents the meaning of these keys, in the hopes that other
CMS integrations will simply use them instead of inventing new ones.
Perhaps at some point we can clean this up, since bad names are a bad idea,
but it also would take a lot of coordination and probably multiple
releases.
Port names are slightly less useful in practice than switch or router names
because Neutron doesn't by default give names to ports. (You can add them
with "openstack port set --name", though.)
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
Ben Pfaff [Thu, 27 Apr 2017 22:47:59 +0000 (15:47 -0700)]
db-ctl-base: Add support for identifying a row based on a value in a map.
This will be used in an upcoming commit to allow Datapath_Binding records
in the OVN southbound database to be identified based on external-ids:name
and other map values.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
Ben Pfaff [Thu, 27 Apr 2017 20:33:12 +0000 (13:33 -0700)]
ovn-sbctl, ovn-nbctl, ovs-vsctl: Remove useless record id methods.
These only did anything if both the first two members of the struct were
nonnull, as you can see from the first test in get_row_by_id() in
lib/db-ctl-base.c, so these never did anything useful and I can't figure
out why they're there.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Russell Bryant <russell@ovn.org>
Ben Pfaff [Thu, 27 Apr 2017 16:36:36 +0000 (09:36 -0700)]
ovn-nbctl: Drop gratuitous indentation for "show" output.
"ovn-nbctl show" indented every line of output by at least 4 spaces, which
needlessly wastes horizontal space. This drops 4 spaces of indent from
each line of output.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org> Acked-by: Russell Bryant <russell@ovn.org>
Ben Pfaff [Sun, 30 Apr 2017 21:09:55 +0000 (14:09 -0700)]
uuid: Change semantics of uuid_is_partial_string().
Until now, uuid_is_partial_string() returned the number of characters at
the beginning of a string that were the beginning of a valid UUID. This
is useful, but all of the callers actually wanted to get a value of 0 if
the string contained a character that was invalid for a UUID. This makes
that change.
Examples:
"123" previously yielded 3 and still does.
"xyzzy" previously yielded 0 and still does.
"123xyzzy" previously yielded 3, now yields 0.
"e66250bb-9531-491b-b9c3-5385cabb0080" previously yielded 36, still does.
"e66250bb-9531-491b-b9c3-5385cabb0080xyzzy" previously yielded 36, now 0.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org> Acked-by: Russell Bryant <russell@ovn.org>
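The new semantics can be mirrored in a short sketch. This is not the C implementation in OVS: `uuid_partial_len` and `UUID_TEMPLATE` are names made up here, and the function only reproduces the behavior listed in the examples above.

```python
import string

# Shape of a full UUID string: hex digits with dashes at fixed positions.
UUID_TEMPLATE = "00000000-0000-0000-0000-000000000000"


def uuid_partial_len(s: str) -> int:
    """Return len(s) if s is a valid prefix of a UUID string, else 0."""
    if len(s) > len(UUID_TEMPLATE):
        return 0  # extra characters after a complete UUID are invalid
    for ch, slot in zip(s, UUID_TEMPLATE):
        if slot == "-":
            if ch != "-":
                return 0
        elif ch not in string.hexdigits:
            return 0
    return len(s)
```

Any invalid character anywhere in the string now gives 0, so callers no longer need to compare the result against the string length themselves.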
fedora: do not restart ovn svcs automatically on pkg upgrade
Similar to commit 5771f4765734 ("fedora: do not restart the
service on a pkg upgrade"), this change eliminates the
automatic restart of OVN services after upgrade.
Note that the post-uninstall scriptlet affected by this change
is executed from the previously installed package when upgrading,
so existing installations need to go through two package upgrades
before this change will take effect.
Signed-off-by: Lance Richardson <lrichard@redhat.com> Signed-off-by: Russell Bryant <rbryant@redhat.com>
Russell Bryant [Fri, 31 Mar 2017 15:27:23 +0000 (11:27 -0400)]
build: Don't run tests in rpm makefile targets.
The RPM build makefile targets are helpful during development and testing,
but I personally almost never want the tests to run when I use them.
Leave tests on by default in the spec file for when the package is built by
distro build systems, but disable it by default in the Makefile targets and
update the documentation accordingly.
Joe Stringer [Mon, 1 May 2017 19:58:06 +0000 (12:58 -0700)]
revalidator: Revalidate ukeys created from flows.
If there is no active ukey for a particular datapath flow, and it is
dumped from the datapath, then the revalidator threads will assemble a
ukey based on the datapath flow. This will allow tracking of the stats
for proper attribution, and future validation of the flow.
However, until now when creating the ukey in this context, the ukey's
'reval_seq' has been set to the current udpif's reval_seq. This implies
that the flow has been validated against the current flow table.
However, this is not true - The flow appeared in the datapath without
any prior knowledge in this OVS instance so we should set up the
reval_seq of the ukey to ensure that the flow will be validated during
the current dump/revalidation cycle.
Refer also to revalidate_ukey().
Fixes: 23597df05226 ("upcall: Create ukeys in handler threads.") Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
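The idea in the commit above can be sketched as follows. This is a toy model, not the OVS code: the `Ukey` class, both function names, and the choice of "current sequence minus one" as the stale value are assumptions made for illustration. The only point is that a ukey built from a dumped flow must not carry the current reval_seq, or it would look already validated.

```python
from dataclasses import dataclass


@dataclass
class Ukey:
    """Hypothetical stand-in for a ukey; only the field discussed above."""
    reval_seq: int


def ukey_from_dumped_flow(current_reval_seq: int) -> Ukey:
    # Assumption: lagging the current sequence by one marks the ukey as
    # "not yet validated against the current flow table".
    return Ukey(reval_seq=current_reval_seq - 1)


def needs_revalidation(ukey: Ukey, current_reval_seq: int) -> bool:
    # Assumption: a ukey is revalidated whenever its sequence differs
    # from the current one.
    return ukey.reval_seq != current_reval_seq
```

With this model, a ukey assembled from a datapath dump is always picked up by the current dump/revalidation cycle, while a ukey stamped with the current sequence is not.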
ovn-northd: Add logical flows to support native DNS
OVN implements native DNS resolution which can be used to resolve the
internal DNS names belonging to a logical datapath.
To support this, a new table 'DNS' is added in the NB DB. A new column
'dns_records' is added to the 'Logical_Switch' table, which references the
'DNS' table.
The following flows are added for each logical switch that is configured with
DNS records in the 'dns_records' column:
- A logical flow in DNS_LOOKUP stage which uses the action 'dns_lookup'
to transform the DNS query to DNS reply packet and advances
to the next stage - DNS_RESPONSE.
- A logical flow in DNS_RESPONSE stage which implements the DNS responder
by sending the DNS reply from previous stage back to the inport.
This patch adds a new OVN action 'dns_lookup' to support native DNS.
ovn-controller parses this action and adds a NXT_PACKET_IN2
OF flow with 'pause' flag set.
A new table 'DNS' is added in the SB DB to look up and resolve
the DNS queries. When a valid DNS packet is received by
ovn-controller, it looks up the DNS name in the 'DNS' table
and if successful, it frames a DNS reply, resumes the packet
and stores 1 in the 1-bit subfield. If the packet is invalid
or cannot be resolved, it resumes the packet without any
modifications and stores 0 in the 1-bit subfield.
reg0[4] = dns_lookup(); next;
An upcoming patch will use this action and add logical flows.
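The result-bit behavior described above can be sketched roughly as follows. This is an illustration only: the dict-based packet, the `dns_table` mapping, and the function name are hypothetical and do not reflect how ovn-controller parses or frames DNS packets.

```python
from typing import Dict, Tuple


def handle_dns_lookup(packet: dict, dns_table: Dict[str, str]) -> Tuple[int, dict]:
    """Return (result_bit, packet_to_resume).

    result_bit plays the role of the 1-bit subfield, e.g. reg0[4].
    """
    name = packet.get("dns_query")  # hypothetical field; None if not a valid query
    answer = dns_table.get(name) if name is not None else None
    if answer is None:
        # Invalid packet or unresolvable name: resume unmodified, store 0.
        return 0, packet
    # Successful lookup: turn the query into a reply, store 1.
    reply = dict(packet, dns_answer=answer, is_reply=True)
    return 1, reply
```

For example, with `dns_table = {"vm1.ovn.org": "10.0.0.4"}`, a query for "vm1.ovn.org" yields bit 1 and a reply carrying the address, while any other name yields bit 0 and the original packet.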
Aaron Conole [Tue, 2 May 2017 20:17:48 +0000 (16:17 -0400)]
rhel: fix the fedora spec
When commit d0c961a99f57 ("lib/automake.mk: don't install
runtime directories") landed, it broke RPM based builds since
the requisite directories were no longer available. This commit
adds those directories back when making RPMs so that the package
manager can see them.
Ben Pfaff [Mon, 1 May 2017 20:19:43 +0000 (13:19 -0700)]
ovs-macros: Add helper to make 'wc' use POSIX compliant output format.
Several times, we've had to fix tests that used 'wc' and expected a
particular output format. POSIX is specific about the output format, but
neither GNU nor BSD wc honors it. This commit makes whatever 'wc' is on
the system use the POSIX output format.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: YAMAMOTO Takashi <yamamoto@ovn.org>
Han Zhou [Sat, 22 Apr 2017 01:55:27 +0000 (18:55 -0700)]
ovn-controller: Avoid recomputing when there are in-flight msgs.
When there are in-flight msgs being sent to OVS, ofctrl_put will
skip, which makes all the flows computed in that main loop
iteration useless. To avoid the wasted CPU cycles, a check is added
before lflow/physical flow run in each iteration.
This yields a huge performance improvement in the following test:
- 1 lswitch with 10 lports bound locally
- Each lport has an ingress ACL, referencing the same address-set
- The address-set has 10,000 IPv4 addresses
For each IP address in the address-set, there will be 3
OpenFlow rules generated for each ACL. So the total number
of rules is 300k+.
Without the patch, it takes 50+ minutes to install all the
rules to ovs-vswitchd.
With the patch, it takes 16 seconds to install all the rules
to ovs-vswitchd.
The reason is that the large number of rules are sent to
ovs-vswitchd gradually in many iterations of ovn-controller
main loop. Without the patch, CPU cycles are wasted in
lflow_run re-processing the large address set in every
main loop iteration. With the patch, this re-processing is
avoided in iterations where previously sent rules are still pending.
Signed-off-by: Han Zhou <zhouhan@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
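The numbers and the skip logic in the commit above can be sketched with a toy model. The constants come from the commit message. `run_main_loop` and its parameters are hypothetical and only count how often the expensive computation would run. This is not the ovn-controller main loop.

```python
# Figures quoted in the commit message.
LPORTS = 10                      # each with one ingress ACL
ADDRESSES_PER_SET = 10_000
RULES_PER_ADDRESS_PER_ACL = 3
TOTAL_RULES = LPORTS * ADDRESSES_PER_SET * RULES_PER_ADDRESS_PER_ACL


def run_main_loop(iterations, busy_iterations, skip_when_busy):
    """Count runs of the expensive flow computation.

    busy_iterations: set of iteration indexes during which earlier
    messages to OVS are still in flight (so nothing new could be sent).
    """
    computations = 0
    for i in range(iterations):
        in_flight = i in busy_iterations
        if skip_when_busy and in_flight:
            continue  # the result could not be installed this round anyway
        computations += 1  # stands in for the lflow/physical flow run
    return computations
```

In this model, if messages are in flight during 8 of 10 iterations, the unpatched loop computes flows 10 times and the patched loop only twice.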
Aaron Conole [Mon, 1 May 2017 20:14:09 +0000 (16:14 -0400)]
checkpatch: fix pointer declaration
A common way of expressing 'raise to the power of' when authoring
comments uses **. This is currently getting caught by the pointer
spacing warning. So, catch it here.
Aaron Conole [Mon, 1 May 2017 20:14:08 +0000 (16:14 -0400)]
checkpatch: filename from hunks fix
Filenames taken from the hunk match include the git-ified 'b/'
prefix, which makes jumping to the error file that much harder. This
patch corrects that by simply skipping those bytes.
Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
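A minimal sketch of the prefix skipping described above, assuming the filename is taken from a "+++ b/<path>" line of a git-style diff. The function name is hypothetical and this is not the code in checkpatch itself.

```python
def filename_from_hunk_header(line):
    """Return the path from a '+++ b/<path>' line without the 'b/' prefix.

    Returns None for any other line.
    """
    prefix = "+++ b/"
    if line.startswith(prefix):
        # Skip the prefix bytes so the path can be opened directly
        # relative to the top of the tree.
        return line[len(prefix):].strip()
    return None
```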
Aaron Conole [Mon, 1 May 2017 20:14:07 +0000 (16:14 -0400)]
checkpatch: print conformance
Other utilities (notoriously the linux kernel's checkpatch.pl) have a more
standardized form for printing file and lines. With this change, the
template used to print gains two enhancements:
1. Color
2. Conformance with the kernel's version of checkpatch.pl
Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Aaron Conole [Mon, 1 May 2017 20:14:06 +0000 (16:14 -0400)]
checkpatch: correct a parsing issue
Occasionally, characters will be sent which violate the
ascii decoder's sense of propriety. In fact, in-tree there are
a few such files (ex: tests/atlocal.in), and they cause an
exception to be raised when they are encountered.
Set the policy to ignore these cases. This means these bytes are
omitted from the text stream during processing.
Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
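The "ignore" policy described above corresponds to Python's standard error handler for decoding. A small sketch, where the function name is an assumption and the exact call site in checkpatch may differ:

```python
def decode_patch_line(raw: bytes) -> str:
    """Decode a line of patch text, silently dropping non-ASCII bytes."""
    # errors="ignore" omits undecodable bytes instead of raising
    # UnicodeDecodeError.
    return raw.decode("ascii", errors="ignore")
```

For instance, the two bytes of a UTF-8 encoded "é" are simply removed, so `b"caf\xc3\xa9 ok"` decodes to `"caf ok"`.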
Aaron Conole [Mon, 1 May 2017 20:14:03 +0000 (16:14 -0400)]
checkpatch: introduce a flexible framework
Developers wishing to add checks to checkpatch currently sift through an
ad hoc mess. The process goes something like:
1. Figure out what to test in the patch
2. Write some code, quickly, that checks for that condition
3. Look through the state machine to find where the check should go
4. Ignore parts of the above and just throw something together
That worked fine for the initial development, but as interesting new tests
are developed, it is important to have a more flexible framework that lets
a developer just plug in a new test, easily.
This commit brings in a new framework that allows plugging in checks very
quickly. Hook up the line-length test as an initial demonstration.
Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
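A pluggable-check framework of the kind described above might look roughly like this. It is a sketch under stated assumptions: the table layout, the names `CHECKS` and `run_checks`, and the 79-character limit are all illustrative and are not taken from checkpatch.

```python
MAX_LINE_LEN = 79  # assumed limit, for illustration only


def line_too_long(line):
    return len(line) > MAX_LINE_LEN


# Adding a check means appending one entry here; the driver loop below
# does not need to change.
CHECKS = [
    {
        "name": "line-length",
        "check": line_too_long,
        "message": "Line is longer than %d characters" % MAX_LINE_LEN,
    },
]


def run_checks(lines, checks=CHECKS):
    """Apply every registered check to every line; return the findings."""
    problems = []
    for lineno, line in enumerate(lines, start=1):
        for entry in checks:
            if entry["check"](line):
                problems.append((lineno, entry["name"], entry["message"]))
    return problems
```

The design choice is that each check is data (a name, a predicate, a message), so a new test plugs in without touching the parsing state machine.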
The Open vSwitch run, log, and DB directories are installed as part of the
normal `make install` process. However, this means they are created with
user and group ownership that may conflict with the desired user. For
example, running `make install` as root will install those files as
root:root, whereas the runtime user desired may be openvswitch:openvswitch.
Since these directories are automatically created as part of the ovs-ctl
command, and with the correct user:group permissions, it makes sense to
delay creation until these directories are actually required.
Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
install-doc: suggest to use ovs-ctl for start/stop
The install documentation guided users to manually start/stop
daemons. This is good information to have, but with the
existence of ovs-ctl, is probably not the best way to start
guiding new users of ovs.
Suggest that users start by running ovs-ctl start, and
document the ability to selectively start/stop the daemons.
The ovs-ctl script is already mentioned a bit in the install
doc, so this just reinforces its use.
Suggested-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
William Tu [Sat, 29 Apr 2017 13:08:43 +0000 (06:08 -0700)]
doc: Fix sphinx reference warning for windows.
Footnotes 5, 8, and 9 are not referenced in the windows.rst content,
causing the following error:
Warning, treated as error:
/root/ovs/Documentation/topics/windows.rst:506:Footnote [5] is not referenced.
Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
William Tu [Sat, 29 Apr 2017 13:30:59 +0000 (06:30 -0700)]
bridge: Prohibit "default" and "all" bridge name.
Under Linux, when a user creates a bridge named "default" or "all",
ovs-vsctl fails, but vswitchd keeps retrying in the background,
causing systemd-udev to reach 100% CPU utilization. This patch prevents
any attempt to create or open a netdev named "default" or "all", because
these two names are reserved on Linux:
/proc/sys/net/ipv4/conf/ always contains directories by these names.
The high CPU utilization is due to frequent calls into the kernel's
register_netdevice function, which invokes several kernel elements that
have registered on the netdevice notifier chain. Because creation fails,
OVS wakes up and tries to recreate the device, which ends up as a high-CPU loop.
VMWare-BZ: #1842388 Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Greg Rose <gvrose8192@gmail.com>
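The name check described above amounts to a reserved-name lookup before any attempt to create or open the device. A sketch with hypothetical names, since the actual check lives in the C bridge code:

```python
# Names that always exist as directories under /proc/sys/net/ipv4/conf/
# on Linux, per the commit message.
RESERVED_NETDEV_NAMES = frozenset({"default", "all"})


def is_reserved_netdev_name(name: str) -> bool:
    """True if a netdev with this name must not be created or opened."""
    return name in RESERVED_NETDEV_NAMES
```

Rejecting the name up front means the create attempt is never retried, which is what breaks the high-CPU loop.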
Ben Pfaff [Fri, 17 Mar 2017 20:43:47 +0000 (13:43 -0700)]
travis: Break Mac OS build for format specifier warnings.
Until now, the Travis build for Mac OS X has been configured to ignore
format specifier warnings. These warnings have now been fixed, so this
commit changes such warnings to errors.
Suggested-by: Daniele Di Proietto <diproiettod@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
Joe Stringer [Fri, 28 Apr 2017 21:45:21 +0000 (14:45 -0700)]
ofp-actions: Document that learn(limit=0) is no limit.
The documentation was unclear that specifying a limit of 0 is the same
as specifying no limit. Controllers that wish to set a learn limit so
that no more than 0 flows are learned may omit the learn action.
Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
Unconditionally define OVS_CT_EVENT_* macros for the datapath netlink
interface so that we do not need to include platform dependent files.
This fixes the build on non-Linux (and non-Windows) platforms.
Also define a macro for the default set of events set by OVS userspace.
Reported-by: Joe Stringer <joe@ovn.org> Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
Andy Zhou [Fri, 28 Apr 2017 21:42:00 +0000 (14:42 -0700)]
test/ofproto: Improve test 'controller action without megaflows'
Commit af7535e7dbeb9 expanded the test to check the output
of meter stats, but without stripping out the duration.
This makes the test sensitive to the speed of
the machine that runs the test. Strip away the timing information
to improve test reliability.
Fixes: af7535e7dbeb9 (ofproto: Meter slowpath action when action
upcall meters are configured) Signed-off-by: Andy Zhou <azhou@ovn.org>
Greg Rose [Thu, 27 Apr 2017 23:13:12 +0000 (16:13 -0700)]
compat: Fix build error in kernels 4.10
Use the acinclude.m4 configuration file to check for the net parameter
that was added to the ipv4 and ipv6 frags init functions in the 4.10
Linux kernel, to determine whether DEFRAG_ENABLE_TAKES_NET should be set,
and then check for that at compile time.
This is an alternative solution patch for the issue reported by Raymond
Burkholder and the patch submitted by Guoshuai Li.
[Committer notes]
Squash in "acinclude.m4: Add check for struct net parameter" which
provides the HAVE_DEFRAG_ENABLE_TAKES_NET.
Reported-by: Raymond Burkholder <ray@oneunified.net> CC: Guoshuai Li <ligs@dtdream.com> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Joe Stringer <joe@ovn.org>
Andy Zhou [Tue, 11 Apr 2017 23:10:41 +0000 (16:10 -0700)]
ofproto: Meter slowpath action when action upcall meters are configured
If a slow path action is a controller action, meter it when the
controller meter is configured. For other kinds of slow path actions,
meter it when the slowpath meter is configured.
Note, this patch only considers the meters configuration of the
packet's input bridge, which may not be the same bridge in which the
action is generated.
Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
Andy Zhou [Fri, 31 Mar 2017 00:03:08 +0000 (17:03 -0700)]
ofproto-dpif: Add 'meter_ids' to backer
Add 'meter_ids', an id-pool object to manage datapath meter id, i.e.
provider_meter_id.
Currently, only userspace datapath supports meter, and it implements
the provider_meter_id management. Moving this function to 'backer'
allows other datapath implementations to share the same logic.
Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
Andy Zhou [Thu, 30 Mar 2017 22:37:27 +0000 (15:37 -0700)]
ofproto: Store meters using hmap
Currently, meters are stored in a fixed pointer array. This is not
very efficient, since the controller, at least in theory, can
pick any meter id (up to the limit of uint32_t), not necessarily
within the lower end of a region, or in close range to each other.
In particular, OFPM_SLOWPATH and OFPM_CONTROLLER meters are specified
at the high region.
Switch to using an hmap. The ofproto layer does not restrict
the number of meters that a controller can add, nor does it care
about the value of meter_id. The datapath limits the number of meters
the ofproto layer can support at run time.
Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
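Why a hash map suits sparse meter ids can be seen in a small sketch. A Python dict stands in for the hmap. The `MeterTable` class is hypothetical, while the two virtual meter ids are the values defined by the OpenFlow specification.

```python
# Virtual meter ids from the OpenFlow specification (high end of uint32_t).
OFPM_SLOWPATH = 0xFFFFFFFD
OFPM_CONTROLLER = 0xFFFFFFFE


class MeterTable:
    """Toy meter store keyed by meter id; memory grows with the number
    of meters, not with the largest id in use."""

    def __init__(self):
        self._meters = {}

    def add(self, meter_id, config):
        if not 0 < meter_id <= 0xFFFFFFFF:
            raise ValueError("meter id out of uint32_t range")
        self._meters[meter_id] = config

    def lookup(self, meter_id):
        return self._meters.get(meter_id)

    def __len__(self):
        return len(self._meters)
```

A fixed array indexed by id would need about four billion slots to hold meter 1 together with OFPM_CONTROLLER, whereas the map above holds just the entries that were added.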
Specify the event mask with CT commit including bits for CT features
exposed at the OVS interface (mark and label changes in addition to
basic creation and destruction of conntrack entries).
Without this any listener of conntrack update events will typically
(depending on system configuration) receive events for each L4 (e.g.,
TCP) state machine change, which can multiply the number of events
received per connection.
By including the new, related, and destroy events any listener of new
conntrack events gets notified of new related and non-related
connections, and any listener of destroy events will get notified of
deleted (typically timed out) conntrack entries.
By including the flags for mark and labels, any listener of conntrack
update events gets notified whenever the connmark or conntrack labels
are changed from the values reported within the new events.
VMware-BZ: #1837218 Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
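The effect of such an event mask can be sketched as a bit filter. The bit positions below are assumed to follow the order of the kernel's IPCT_* enum (new, related, destroy, ..., protoinfo, ..., mark, ..., label) and should be checked against the kernel headers. The class and function names are hypothetical.

```python
from enum import IntFlag


class CtEvent(IntFlag):
    # Assumed bit positions, mirroring the IPCT_* enum ordering.
    NEW = 1 << 0
    RELATED = 1 << 1
    DESTROY = 1 << 2
    PROTOINFO = 1 << 5   # L4 (e.g., TCP) state machine changes
    MARK = 1 << 7
    LABEL = 1 << 10


# Events the text above lists as included in the mask.
DEFAULT_EVENT_MASK = (CtEvent.NEW | CtEvent.RELATED | CtEvent.DESTROY
                      | CtEvent.MARK | CtEvent.LABEL)


def should_deliver(event: CtEvent, mask: CtEvent) -> bool:
    """True if an event of this type passes the per-connection mask."""
    return bool(event & mask)
```

With this mask, creation, destruction, mark, and label events are delivered, while per-packet protocol state changes are filtered out.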
Jarno Rajahalme [Sat, 11 Mar 2017 00:10:41 +0000 (16:10 -0800)]
tests: ICMP related to original direction test.
Normally ICMP responses are in the reply direction of a conntrack
entry. This test exercises an ICMP response to the original direction
of the conntrack entry.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
openvswitch: Delete conntrack entry clashing with an expectation.
Conntrack helpers do not check for a potentially clashing conntrack
entry when creating a new expectation. Also, nf_conntrack_in() will
check expectations (via init_conntrack()) only if a conntrack entry
can not be found. The expectation for a packet which also matches an
existing conntrack entry will not be removed by conntrack, and is
currently handled inconsistently by OVS, as OVS expects the
expectation to be removed when the connection tracking entry matching
that expectation is confirmed.
It should be noted that normally an IP stack would not allow reuse of
a 5-tuple of an old (possibly lingering) connection for a new data
connection, so this is a somewhat unlikely corner case. However, it is
possible that a misbehaving source could cause conntrack entries to be
created that could then interfere with new related connections.
Fix this in the OVS module by deleting the clashing conntrack entry
after an expectation has been matched. This causes the following
nf_conntrack_in() call to also find the expectation and remove it when
creating the new conntrack entry, and causes the forthcoming reply
direction packets to match the new related connection instead of the
old clashing conntrack entry.
Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action") Reported-by: Yang Song <yangsong@vmware.com> Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
Upstream commit 5a8145f7b222 ("netfilter: labels: don't emit ct event
if labels were not changed"), released in Linux 4.7, changed
nf_connlabels_replace() to trigger conntrack event for a label change
only when the labels actually changed. Without this change an update
event is triggered even if the labels already have the values they are
being set to.
There is no way we can detect this functional change from Linux
headers, so provide replacements that work the same for older Linux
releases regardless of whether a distribution provides backports.
VMware-BZ: #1837218 Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
Add a new optional conntrack action attribute OVS_CT_ATTR_EVENTMASK,
which can be used in conjunction with the commit flag
(OVS_CT_ATTR_COMMIT) to set the mask of bits specifying which
conntrack events (IPCT_*) should be delivered via the Netfilter
netlink multicast groups. Default behavior depends on the system
configuration, but typically a lot of events are delivered. This can be
very chatty for the NFNLGRP_CONNTRACK_UPDATE group, even if only some
types of events are of interest.
Netfilter core init_conntrack() adds the event cache extension, so we
only need to set the ctmask value. However, if the system is
configured without support for events, the setting will be skipped due
to extension not being found.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Reviewed-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Joe Stringer <joe@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
Joe Stringer [Thu, 27 Apr 2017 01:03:12 +0000 (18:03 -0700)]
revalidator: Improve logging for transition_ukey().
There are a few cases where more introspection into ukey transitions
would be relevant for logging or assertion. Track the SOURCE_LOCATOR and
thread id when states are transitioned and use these for logging.
Suggested-by: Jarno Rajahalme <jarno@ovn.org> Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Joe Stringer [Thu, 27 Apr 2017 01:03:11 +0000 (18:03 -0700)]
revalidator: Avoid assert in transition_ukey().
There is a case where a flow is dumped from the kernel after the ukey is
already transitioned into an EVICTING/EVICTED/DELETED state, and the
revalidator thread attempts to shift that into UKEY_OPERATIONAL because
it was able to dump the flow from the datapath. This resulted in
triggering the assert in transition_ukey(). Detect this condition and
skip handling the flow (as it's already on its way out).
Users report:
> Program terminated with signal SIGABRT, Aborted.
> raise () from /lib/x86_64-linux-gnu/libc.so.6
> raise () from /lib/x86_64-linux-gnu/libc.so.6
> abort () from /lib/x86_64-linux-gnu/libc.so.6
> ovs_abort_valist
> vlog_abort_valist
> vlog_abort
> ovs_assert_failure
> transition_ukey (ukey=<optimized out>, dst=<optimized out>)
> at ofproto/ofproto-dpif-upcall.c:1674
> revalidate (revalidator=0x1cb36c8) at ofproto/ofproto-dpif-upcall.c:2324
> udpif_revalidator (arg=0x1cb36c8) at ofproto/ofproto-dpif-upcall.c:901
> ovsthread_wrapper (aux_=<optimized out>) at lib/ovs-thread.c:348
> start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
> clone () from /lib/x86_64-linux-gnu/libc.so.6
VMware-BZ: #1857694 Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
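The guard described in the commit above can be sketched as a state comparison. The enum below is a hypothetical model: the commit message names only the OPERATIONAL, EVICTING, EVICTED, and DELETED states, so the two earlier states and the numeric ordering are assumptions.

```python
from enum import IntEnum


class UkeyState(IntEnum):
    # Assumed lifecycle order; CREATED and VISIBLE are illustrative.
    CREATED = 0
    VISIBLE = 1
    OPERATIONAL = 2
    EVICTING = 3
    EVICTED = 4
    DELETED = 5


def should_handle_dumped_flow(state: UkeyState) -> bool:
    """Skip flows whose ukey is already on its way out, instead of
    trying to move the ukey back to OPERATIONAL (which would assert)."""
    return state <= UkeyState.OPERATIONAL
```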
Russell Bryant [Wed, 19 Apr 2017 16:41:37 +0000 (12:41 -0400)]
ovn: Bump ovn-nb schema version.
Commit b89d25e5694b made the "router" DHCPv4 option optional instead of
mandatory. This did not actually change the schema, but there's no good
way for a client of the northbound database to know if this change is
present without bumping the schema version. This is needed for a client to
work with versions before and after this change.
Reported-at: https://bugs.launchpad.net/networking-ovn/+bug/1670666 Fixes: b89d25e5694b ("ovn: Modify the DHCPv4 router option to optional") Signed-off-by: Russell Bryant <russell@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Thu, 13 Apr 2017 17:47:55 +0000 (10:47 -0700)]
debian, xenserver: Update logrotate config to match RHEL.
Commit 618a5b45ae8b ("rhel: Avoid logrotate error if /var/run/openvswitch
does not exist") updated the RHEL logrotate configuration. This commit
makes similar changes for Debian, by synchronizing with the RHEL version.
In particular:
- Indent to match logrotate.conf(5) examples.
- Use "sharedscripts" flag, because the postrotate script only needs to
run once regardless of the number of rotations.
- Drop "delaycompress", because the postrotate script does make daemons
reopen their log files.
- Ignore errors calling vlog/reopen.
Also make similar changes to the xenserver logrotate script. I confirmed
via Twitter that the xenserver packaging still has users.
Kevin Traynor [Mon, 24 Apr 2017 17:48:34 +0000 (18:48 +0100)]
docs: Add some detail about dpdk-socket-mem.
Using dpdk-socket-mem to allocate memory for some NUMA nodes
but leaving it blank for subsequent ones is equivalent to assigning
0 MB of memory to those subsequent nodes. Document this behavior.
Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
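The equivalence described above can be illustrated with a small sketch. The function name and the way the node count is passed in are assumptions for the example; this is not how OVS or DPDK parse the option internally.

```python
def expand_socket_mem(value, numa_nodes):
    """Expand a comma-separated per-node MB list to one entry per node.

    Nodes that are left out (or left blank) are treated as 0 MB.
    """
    parts = [p.strip() for p in value.split(",")] if value.strip() else []
    if len(parts) > numa_nodes:
        raise ValueError("more entries than NUMA nodes")
    mem = [int(p) if p else 0 for p in parts]
    # Pad the trailing nodes that were not mentioned with 0 MB.
    return mem + [0] * (numa_nodes - len(mem))
```

On a two-node system, expand_socket_mem("1024", 2) and expand_socket_mem("1024,0", 2) give the same result under this model.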
ofproto: Add support of OFPR_PACKET_OUT as packet-in reason
This patch adds support of OFPR_PACKET_OUT as the packet-in reason.
This packet-in reason is a required feature for OF1.4+, and it indicates
that the associated packet-in message to the controller is triggered when
the switch is processing a packet-out message. This reason code is enabled
by default when OF1.4+ is used.
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Starting with OpenFlow 1.4, OFPR_ACTION is split into four more descriptive
reasons: OFPR_APPLY_ACTION, OFPR_ACTION_SET, OFPR_GROUP, and OFPR_PACKET_OUT.
OVS maintains the new reason codes internally, and it currently supports the
first three reason codes. If the version of an established OpenFlow connection
is less than 1.4, OVS converts the internal reason code back to OFPR_ACTION to
be backward compatible. However, the internal packet-in reason code mask is
not properly maintained for older OpenFlow versions, which may still emit
packet-in messages with the new reason codes. This is because OVS does not enable
the new reason codes internally in the reason code mask for older OpenFlow
versions. This commit tries to address the aforementioned issue.
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
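A rough sketch of the conversion and mask handling described above. It models reasons as strings and versions as tuples. The function names and the rule "enabling OFPR_ACTION on an older connection also enables the newer internal reasons" are assumptions for illustration, not the actual OVS implementation.

```python
# Reasons that OpenFlow 1.4 split out of OFPR_ACTION (modeled as strings).
NEW_REASONS = frozenset({"OFPR_ACTION_SET", "OFPR_GROUP", "OFPR_PACKET_OUT"})


def wire_reason(internal, version):
    """Reason code reported to the controller on a given connection."""
    if version < (1, 4) and internal in NEW_REASONS:
        # Older protocol versions only know OFPR_ACTION.
        return "OFPR_ACTION"
    return internal


def internal_mask(enabled, version):
    """Internal reason mask derived from what the controller enabled."""
    mask = set(enabled)
    if version < (1, 4) and "OFPR_ACTION" in mask:
        # Assumed fix: also enable the newer internal reasons, since
        # they are reported as OFPR_ACTION on this connection.
        mask |= NEW_REASONS
    return mask


def should_send(internal, enabled, version):
    """True if a packet-in with this internal reason passes the mask."""
    return internal in internal_mask(enabled, version)
```

Under this model an OpenFlow 1.3 connection that enabled OFPR_ACTION still receives packet-ins generated by groups, reported as OFPR_ACTION.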
In testcase "2319: ovn-nbctl - NATs", valgrind reports a memory leak with
the following call stack.
xmalloc (util.c:112)
xvasprintf (util.c:176)
xasprintf (util.c:272)
nbctl_lr_nat_list (ovn-nbctl.c:2400)
do_nbctl (ovn-nbctl.c:3121)
main (ovn-nbctl.c:142)
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Guoshuai Li [Mon, 24 Apr 2017 04:33:51 +0000 (12:33 +0800)]
ovn-detrace: Add ovn-detrace to fedora rpm package.
Otherwise, building the Fedora package with "make rpm-fedora" fails with:
error: Installed (but unpackaged) file(s) found:
/usr/bin/ovn-detrace
/usr/share/man/man1/ovn-detrace.1.gz
Signed-off-by: Guoshuai Li <ligs@dtdream.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
In testcase "ofproto-dpif - VLAN handling", valgrind reports a memory
leak with the following call stack.
xcalloc (util.c:95)
bitmap_allocate (bitmap.h:51)
vlan_bitmap_from_array (vlan-bitmap.c:32)
port_configure (bridge.c:983)
bridge_reconfigure (bridge.c:682)
bridge_run (bridge.c:2993)
main (ovs-vswitchd.c:111)
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
rstp/stp: Unref the rstp/stp when bridges destroyed.
When a bridge with STP enabled is destroyed, you can
still get its STP info via the command 'ovs-appctl stp/show'.
The same applies to RSTP. We should unref
them. The RSTP/STP ports have already been unregistered via the
'ofproto_port_unregister' function when the ports were destroyed.
Unref the RSTP/STP struct in the 'destruct' function of the
ofproto-dpif provider.
Han Zhou [Fri, 31 Mar 2017 23:46:22 +0000 (16:46 -0700)]
ovn-sbctl: fix lflow-list when uuid has leading 0s.
When a uuid starts with 0s, lflow-list fails if the leading 0s are
not included in the command argument. This leads to unexpected results,
because leading 0s are usually not shown in the cookies
of OpenFlow output from tools such as ovs-ofctl dump-flows
and ovs-appctl ofproto/trace. E.g.
The Open vSwitch datapath recirculates packets for tunneling, i.e. the
incoming packets are encapsulated on the first pass, and further actions are
applied to the encapsulated packets on the second pass after
recirculating. This patch computes and appends the post-tunnel
actions at translation time instead of recirculating in the
datapath. These actions depend solely on tunnel attributes, so
there is no need for datapath recirculation. By avoiding
recirculation in the datapath, the patch offers up to 30% performance
improvement for VXLAN tunneling in our testing. The action execution
logic uses the new CLONE action to define the packet cloning when
the actions are combined. The length in the CLONE action specifies
the size of the nested action set.
It also fixes the testsuite failures that are introduced by the nested
CLONE action in tunneling.
Signed-off-by: Sugesh Chandran <sugesh.chandran@intel.com> Signed-off-by: Zoltán Balogh <zoltan.balogh@ericsson.com> Co-authored-by: Zoltán Balogh <zoltan.balogh@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Han Zhou [Fri, 24 Mar 2017 06:43:25 +0000 (23:43 -0700)]
ovn-northd: Add hint in lflow to link back to acl
It is helpful for troubleshooting if we can link a logical flow
back to the ACL that generated it. This patch adds a stage-hint as
an external-id in the lflow. The hint contains stage-specific information.
For now only lflows in ACL stages have a hint, which is the ACL uuid, though
the same mechanism can be used to add hints for other stages later.
Signed-off-by: Han Zhou <zhouhan@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Stephen Finucane [Tue, 18 Apr 2017 10:26:49 +0000 (11:26 +0100)]
doc: Don't override default theme
Sphinx 1.3 renamed the 'default' theme to 'classic' and configured the
'alabaster' theme as the new default. To prevent breaking existing
builds, the 'default' name was reserved as an alias for 'classic' [1].
However, initially this raised a warning [1] with a message to use
'classic' instead. This warning was removed in 1.3.2 [2], but it will
result in errors (due to the use of the '-W' flag) for Sphinx 1.3.0 and
1.3.1 users.
Mitigate the issue by not setting a theme if the 'ovs_sphinx_theme'
package is absent. This will result in Sphinx using its default theme,
be that 'classic' (Sphinx < 1.3) or 'alabaster'.
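The mitigation can be sketched as a helper that a Sphinx conf.py might use. The helper name and the theme name 'ovs' are assumptions for illustration, and the real conf.py may be structured differently.

```python
import importlib.util


def theme_settings(package="ovs_sphinx_theme", theme="ovs"):
    """Return the theme-related settings to apply in conf.py.

    If the theme package is not installed, return no settings at all,
    so Sphinx falls back to whatever its own default theme is.
    """
    if importlib.util.find_spec(package) is None:
        return {}
    # 'theme' is a hypothetical name for the theme the package provides.
    return {"html_theme": theme}
```

In conf.py the result could be applied with globals().update(theme_settings()), which leaves html_theme unset when the package is absent.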