Zong Kai LI [Thu, 4 May 2017 15:12:54 +0000 (20:42 +0530)]
lib: rename ovs_nd_opt to ovs_nd_lla_opt
Since ovs_nd_mtu_opt and ovs_nd_prefix_opt are introduced, rename
ovs_nd_opt to ovs_nd_lla_opt to make clear that it is the Source/Target
Link-layer Address Option.
Signed-off-by: Zongkai LI <zealokii@gmail.com> Signed-off-by: Numan Siddique <nusiddiq@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Zong Kai LI [Thu, 4 May 2017 15:12:30 +0000 (20:42 +0530)]
packets: add compose_nd_ra
This patch introduces methods to compose a Router Advertisement (RA) packet
and introduces flags for RA. The RA packet structures follow the specification
in RFC 4861.
Callers can use compose_nd_ra_with_sll_mtu_opts to compose an RA packet with
a Source Link-layer Address Option and an MTU Option.
Callers can use packet_put_ra_prefix_opt to append a Prefix Information Option
to an RA packet.
Signed-off-by: Zongkai LI <zealokii@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
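As a rough illustration of the wire layout described in RFC 4861 (not the OVS C code), the RA header and the three options named above can be sketched in Python. The helper names here are hypothetical, and the ICMPv6 checksum is left at zero, which a real sender would have to compute.

```python
import ipaddress
import struct

ND_ROUTER_ADVERT = 134    # ICMPv6 type of a Router Advertisement
ND_OPT_SOURCE_LLA = 1     # Source Link-layer Address Option
ND_OPT_PREFIX_INFO = 3    # Prefix Information Option
ND_OPT_MTU = 5            # MTU Option


def ra_header(cur_hop_limit, flags, router_lifetime,
              reachable_time=0, retrans_timer=0):
    # type, code, checksum (0 in this sketch), hop limit, M/O flags,
    # router lifetime, reachable time, retransmission timer: 16 bytes.
    return struct.pack("!BBHBBHII", ND_ROUTER_ADVERT, 0, 0, cur_hop_limit,
                       flags, router_lifetime, reachable_time, retrans_timer)


def sll_opt(mac):
    # Option length is counted in units of 8 bytes: 1 unit here.
    return struct.pack("!BB6s", ND_OPT_SOURCE_LLA, 1, mac)


def mtu_opt(mtu):
    # type, length (1 unit), 2 reserved bytes, 32-bit MTU.
    return struct.pack("!BBHI", ND_OPT_MTU, 1, 0, mtu)


def prefix_opt(prefix, prefix_len, flags, valid_lifetime, preferred_lifetime):
    # type, length (4 units = 32 bytes), prefix length, L/A flags,
    # valid lifetime, preferred lifetime, 4 reserved bytes, 16-byte prefix.
    return struct.pack("!BBBBIII16s", ND_OPT_PREFIX_INFO, 4, prefix_len,
                       flags, valid_lifetime, preferred_lifetime, 0,
                       ipaddress.IPv6Address(prefix).packed)


ra = (ra_header(cur_hop_limit=64, flags=0, router_lifetime=1800)
      + sll_opt(bytes.fromhex("0a0000000001"))
      + mtu_opt(1500)
      + prefix_opt("fd00::", 64, 0xC0, 86400, 14400))
```

The prefix option is appended last, mirroring the split in the commit between composing the RA with SLL and MTU options and appending a Prefix Information Option afterwards.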
ovs-pki: add option to suppress generated id in common name
For some applications, it is desirable to have full control of
the common name field in generated certificates. Add a command-line
option to suppress appending " id:<uuid-or-date>" to the user-
specified name.
Signed-off-by: Lance Richardson <lrichard@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
ovsdb: refactor utility functions into separate file
Move local db access functions to a new file and give them
global scope so they can be included in the ovsdb library and used
by other ovsdb library functions.
Signed-off-by: Lance Richardson <lrichard@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Sun, 30 Apr 2017 20:39:00 +0000 (13:39 -0700)]
ovn-trace: Display friendlier port and datapath names.
This makes ovn-trace use short names instead of UUIDs (etc.) in its own
output, by default. Since it's possible that there's software out there
parsing ovn-trace output, it also adds a --no-friendly-names option.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
Ben Pfaff [Sun, 30 Apr 2017 21:21:04 +0000 (14:21 -0700)]
ovn-trace: Accept human-friendly logical port and datapath names.
This allows the user to specify these names in a natural way, e.g.
"ovn-trace myswitch 'inport == "myport"'", instead of having to specify
whatever UUID or other horrible name the CMS invented.
Simon Horman [Wed, 3 May 2017 14:33:06 +0000 (16:33 +0200)]
tests: Only run python SSL test if SSL support is configured
Only run python SSL test, which invokes ovsdb with a --remote=pssl,
if SSL support is configured.
Without this change the following error appears when running
the test-suite when OVS is configured with --disable-ssl.
+ovsdb-server: Private key specified but Open vSwitch was built without SSL support
./ovsdb-idl.at:1215: exit code was 1, expected 0
Fixes: d90ed7d65ba8 ("python: Add SSL support to the python ovs client library") Signed-off-by: Simon Horman <simon.horman@netronome.com> Acked-by: Ben Pfaff <blp@ovn.org>
When IPv6 is compiled but disabled at runtime, __vxlan_sock_add returns
-EAFNOSUPPORT. For metadata based tunnels, this causes failure of the whole
operation of bringing up the tunnel.
Ignore failure of IPv6 socket creation for metadata based tunnels caused by
IPv6 not being available.
Fixes: b1be00a6c39f ("vxlan: support both IPv4 and IPv6 sockets in a single vxlan device") Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
Jan Scheurich [Tue, 25 Apr 2017 16:29:59 +0000 (16:29 +0000)]
userspace: Add packet_type in dp_packet and flow
This commit adds a packet_type attribute to the structs dp_packet and flow
to explicitly carry the type of the packet as preparation for the
introduction of the so-called packet type-aware pipeline (PTAP) in OVS.
The packet_type is a big-endian 32-bit integer with the encoding as
specified in OpenFlow version 1.5.
The upper 16 bits contain the packet type name space. Pre-defined values
are defined in openflow-common.h:
    enum ofp_header_type_namespaces {
        OFPHTN_ONF = 0,             /* ONF namespace. */
        OFPHTN_ETHERTYPE = 1,       /* ns_type is an Ethertype. */
        OFPHTN_IP_PROTO = 2,        /* ns_type is a IP protocol number. */
        OFPHTN_UDP_TCP_PORT = 3,    /* ns_type is a TCP or UDP port. */
        OFPHTN_IPV4_OPTION = 4,     /* ns_type is an IPv4 option number. */
    };
The lower 16 bits specify the actual type in the context of the name space.
Only name spaces 0 and 1 will be supported for now.
For name space OFPHTN_ONF the relevant packet type is 0 (Ethernet).
This is the default packet_type in OVS and the only one supported so far.
Packets of type (OFPHTN_ONF, 0) are called Ethernet packets.
In name space OFPHTN_ETHERTYPE the type is the Ethertype of the packet.
A packet of type (OFPHTN_ETHERTYPE, <Ethertype>) is a standard L2 packet
with the Ethernet header (and any VLAN tags) removed to expose the L3
(or L2.5) payload of the packet. These will simply be called L3 packets.
The Ethernet address fields dl_src and dl_dst in struct flow are not
applicable for an L3 packet and must be zero. However, to maintain
compatibility with the large code base, we have chosen to copy the
Ethertype of an L3 packet into the dl_type field of struct flow.
This does not mean that it will be possible to match on dl_type for L3
packets with PTAP later on. Matching must be done on packet_type instead.
New dp_packets are initialized with packet_type Ethernet. Ports that
receive L3 packets will have to explicitly adjust the packet_type.
Signed-off-by: Jean Tourrilhes <jt@labs.hpe.com> Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com> Co-authored-by: Zoltan Balogh <zoltan.balogh@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
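The encoding described above (name space in the upper 16 bits, type in the lower 16 bits, big-endian on the wire) can be sketched as follows. This is an illustration in Python rather than the OVS C code; the helper names and the PT_* constants are hypothetical.

```python
import struct

OFPHTN_ONF = 0          # ONF namespace
OFPHTN_ETHERTYPE = 1    # ns_type is an Ethertype


def packet_type(ns, ns_type):
    # Upper 16 bits: name space; lower 16 bits: type within that name space.
    return (ns << 16) | ns_type


def packet_type_be(ns, ns_type):
    # The same value serialized as a big-endian 32-bit integer.
    return struct.pack("!I", packet_type(ns, ns_type))


PT_ETH = packet_type(OFPHTN_ONF, 0)               # Ethernet packet
PT_IPV4 = packet_type(OFPHTN_ETHERTYPE, 0x0800)   # L3 packet carrying IPv4
```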
Ben Pfaff [Sun, 30 Apr 2017 20:53:24 +0000 (13:53 -0700)]
ovn-sbctl: Get rid of redundant code by using function from db-ctl-base.
This renames get_row() to ctl_get_row() and makes it public. It's
unfortunate that it adds a cast, but getting rid of redundant code seems
worth it to me.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
Ben Pfaff [Sun, 30 Apr 2017 20:52:11 +0000 (13:52 -0700)]
ovn-sbctl: Allow database commands to refer to datapaths by name.
Until now, only the lflow-list command supported using UUIDs or names
for datapaths. This commit extends that support to all the database
commands, as well as adding support for matching "logical-switch" or
"logical-router" in addition to "name".
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
Ben Pfaff [Sun, 30 Apr 2017 05:57:31 +0000 (22:57 -0700)]
ovn-northd: Keep external-ids up-to-date in Datapath_Binding.
Without this, ovn-northd sets external-ids properly when it creates a
Datapath_Binding record, but fails to update the external-ids if they
later change.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
Ben Pfaff [Sun, 30 Apr 2017 21:24:18 +0000 (14:24 -0700)]
ovn-northd: Propagate Neutron datapath names to southbound database.
It's much easier to see what's going on in the southbound database if
human-friendly names are available.
Really it's too bad that we didn't put the human-friendly name in "name"
and the UUID in something like "external_ids:neutron-uuid", but it'll take
more coordination to change that at this point and it may not be worth it.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
Joe Stringer [Wed, 3 May 2017 18:53:29 +0000 (11:53 -0700)]
datapath: Remove untracked CT on newer kernels.
Upstream commits cc41c84b7e7f ("netfilter: kill the fake untracked
conntrack objects") and ab8bc7ed864b ("netfilter: remove
nf_ct_is_untracked") removed the 'untracked' conntrack objects and
functions. The latter commit removes the usage of nf_ct_is_untracked()
from OVS. However, older kernels still have a representation of
'untracked' CT objects so the code needs to remain until the kernel
support is bumped to Linux 4.12 or newer. Introduce a macro to detect
this symbol and wrap these lines in the macro check.
Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Greg Rose <gvrose8192@gmail.com>
Ben Pfaff [Wed, 3 May 2017 18:05:53 +0000 (11:05 -0700)]
ovs-atomic: Report error for contradictory configuration.
A user reported that GCC 5.x was using the atomic fallback for GCC 4.x
because the test
#elif __GNUC__ >= 4 && __GNUC_MINOR__ >= 7
didn't include GCC 5. However, GCC 5+ has <stdatomic.h> and shouldn't use
any of the GCC-specific cases at all. I think that this user was actually
pulling our atomics out into third-party code that probably didn't define
HAVE_STDATOMIC_H properly, so this commit both avoids that problem for
them in the future and clarifies the intent of the ovs-atomic header.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
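To see why the quoted preprocessor test excludes GCC 5.x, the comparison can be mirrored in Python. This is only a model of the version arithmetic, with hypothetical function names; it is not code from the ovs-atomic header.

```python
def buggy_test(major, minor):
    # Mirrors "__GNUC__ >= 4 && __GNUC_MINOR__ >= 7": the minor number is
    # compared on its own, so GCC 5.4 fails because 4 < 7.
    return major >= 4 and minor >= 7


def version_at_least(major, minor, want_major, want_minor):
    # Compares (major, minor) as a pair, which is what "4.7 or later" means.
    return (major, minor) >= (want_major, want_minor)
```

With these definitions, buggy_test(4, 9) holds but buggy_test(5, 4) does not, while version_at_least(5, 4, 4, 7) holds.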
Andy Zhou [Tue, 25 Apr 2017 01:55:04 +0000 (18:55 -0700)]
vswitchd: Add --cleanup option to the 'appctl exit' command
'appctl exit' stops the running vswitchd daemon, without releasing
the datapath resources (such as bridges and ports) that vswitchd
has created. This is expected when vswitchd is to be relaunched, to
reduce the perturbation of existing traffic and connections.
However, when vswitchd is intended to be shut down permanently, it
is desirable not to leak datapath resources. In theory, this can be
achieved by removing the corresponding configurations from
OVSDB before shutting down vswitchd. However it is not always
possible in practice. Sometimes it is convenient and robust for
vswitchd to release all datapath resources that it has configured.
Add 'appctl exit --cleanup' option for this use case.
Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
DPDK 16.07 introduced support for mempool offload.
rte_pktmbuf_pool_create is the recommended method for creating pktmbuf
pools. Buffer pools created with rte_mempool_create may not get offloaded
to the underlying offloaded mempools.
This patch changes the rte_mempool_create call to use the helper wrapper
"rte_pktmbuf_pool_create" provided by DPDK, so that it can leverage
offloaded mempools.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com> Acked-by: Jianbo Liu <jianbo.liu@linaro.org> Acked-by: Kevin Traynor <ktraynor@redhat.com> Acked-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Billy O'Mahony [Wed, 1 Mar 2017 12:36:43 +0000 (12:36 +0000)]
netdev-dpdk: Enable INDIRECT_DESC on DPDK vHostUser.
This gives much better performance for Linux apps in the guest without
affecting DPDK applications in the guest.
I'm creating this patch on the basis of performance results outlined below.
In summary it appears that enabling INDIRECT_DESC on DPDK vHostUser ports
leads to a very large increase in performance when using Linux stack
applications in the guest with no noticeable performance drop for DPDK based
applications in the guest.
Test#1 (VM-VM iperf3 performance)
VMs use DPDK vhostuser ports
OVS bridge is configured for normal action.
OVS version 603381a (on 2.7.0 branch but before release,
also seen on v2.6.0 and v2.6.1)
DPDK v16.11
QEMU v2.5.0 (also seen with v2.7.1)
Test#2 (Phy-VM-Phy RFC2544 Throughput)
DPDK PMDs are polling NIC, DPDK loopback app running in guest.
OVS bridge is configured with port forwarding to VM (via dpdkvhostuser ports).
OVS version 603381a (on 2.7.0 branch but before release),
other versions not tested.
DPDK v16.11
QEMU v2.5.0 (also seen with v2.7.1)
Mark Kavanagh [Mon, 13 Mar 2017 11:35:26 +0000 (11:35 +0000)]
netdev-dpdk: fix mempool_configure error state
netdev_dpdk_mempool_configure obtains a handle to a
DPDK memory pool via a call to dpdk_mp_get. If dpdk_mp_get
fails, the former informs the user that insufficient memory
is available, and returns ENOMEM. However, this is
potentially misleading, as there are a number of reasons why
creation of a mempool can fail (as per rte_mempool_create),
including:
- insufficient memory available
- mempool already exists
- other memory allocation error
Update the error log to reflect this fact, and return rte_errno
in the event of error, instead of ENOMEM.
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Fixes: 0072e931 ("netdev-dpdk: add support for jumbo frames") Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Ian Stokes <ian.stokes@intel.com> Acked-by: Darrell Ball <dlu998@gmail.com>
Han Zhou [Tue, 2 May 2017 20:22:35 +0000 (13:22 -0700)]
ovn-controller: Disable probes by default for unix sockets.
Normally the OVS JSON-RPC library does not probe idle connections across
Unix domain sockets, since the kernel can tell OVS whether the connections
are truly connected without probes, but ovn-controller carelessly
overrode that.
(This should not be an issue in typical OVN deployments, because the OVN SB
database is normally accessed via TCP or SSL.)
CC: Nirapada Ghosh <nghosh@us.ibm.com> Fixes: 715038b6b222 ("ovn-controller: reload configured SB probe timer") Signed-off-by: Han Zhou <zhouhan@gmail.com> Co-authored-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org>
Yi-Hung Wei [Mon, 1 May 2017 17:24:35 +0000 (10:24 -0700)]
system-traffic: Add test for mpls actions
Add ping test to verify the behavior of mpls_push/pop actions. In this
test, we use the resubmit action to trigger recirculation for making sure
the flow key is revalidated after mpls_push/pop. This test depends on
commit 5ba0c107c51e ("datapath: Fix ovs_flow_key_update()") to behave
correctly.
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Acked-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
ovs_flow_key_update() is called when the flow key is invalid, and it is
used to update and revalidate the flow key. Commit 329f45bc4f19
("openvswitch: add mac_proto field to the flow key") introduces a mac_proto
field to the flow key and uses it to determine whether the flow key is valid.
However, the commit does not update the code path in ovs_flow_key_update()
to revalidate the flow key, which may cause BUG_ON() in execute_recirc().
This patch addresses the aforementioned issue.
Fixes: 329f45bc4f19 ("openvswitch: add mac_proto field to the flow key") Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Acked-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Acked-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
openvswitch: correctly fragment packet with mpls headers
If mpls headers were pushed to a defragmented packet, the refragmentation no
longer works correctly after 48d2ab609b6b ("net: mpls: Fixups for GSO"). The
network header has to be shifted after the mpls headers for the
fragmentation and restored afterwards.
Fixes: 48d2ab609b6b ("net: mpls: Fixups for GSO") Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
openvswitch: mpls: set network header correctly on key extract
After the 48d2ab609b6b ("net: mpls: Fixups for GSO"), MPLS handling in
openvswitch was changed to have network header pointing to the start of the
MPLS headers and inner_network_header pointing after the MPLS headers.
However, key_extract was missed by the mentioned commit, causing incorrect
headers to be set when a MPLS packet just enters the bridge or after it is
recirculated.
Fixes: 48d2ab609b6b ("net: mpls: Fixups for GSO") Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
Yi-Hung Wei [Mon, 1 May 2017 17:24:31 +0000 (10:24 -0700)]
datapath: Fixups for MPLS GSO
This patch backports the following two upstream commits to fix MPLS GSO in
ovs datapath. Starting from upstream commit 48d2ab609b6b ("net: mpls: Fixups
for GSO"), the mpls_gso kernel module relies on the fact that
skb_network_header() points to the mpls header and skb_inner_network_header()
points to the L3 header so that it can derive the length of mpls header
correctly, and the upstream commit updates how ovs datapath marks the skb
header when push and pop mpls. However, the old mpls_gso kernel module
assumes that the skb_network_header() points to the L3 header, and the old
mpls_gso kernel module will misbehave if the ovs datapath marks the
skb_network_header() in the new way since it will treat mpls header as the L3
header.
Because the function signature of mpls_gso_segment() does not change,
this backport patch uses the new mpls_hdr() to determine if the kernel that
ovs datapath is compiled with has the new or legacy mpls_gso kernel module.
It has been tested on kernel 4.4 and 4.9.
As reported by Lennert the MPLS GSO code is failing to properly segment
large packets. There are a couple of problems:
1. the inner protocol is not set so the gso segment functions for inner
protocol layers are not getting run, and
2. MPLS labels for packets that use the "native" (non-OVS) MPLS code
are not properly accounted for in mpls_gso_segment.
The MPLS GSO code was added for OVS. It is re-using skb_mac_gso_segment
to call the gso segment functions for the higher layer protocols. That
means skb_mac_gso_segment is called twice -- once with the network
protocol set to MPLS and again with the network protocol set to the
inner protocol.
This patch sets the inner skb protocol addressing item 1 above and sets
the network_header and inner_network_header to mark where the MPLS labels
start and end. The MPLS code in OVS is also updated to set the two
network markers.
From there the MPLS GSO code uses the difference between the network
header and the inner network header to know the size of the MPLS header
that was pushed. It then pulls the MPLS header, resets the mac_len and
protocol for the inner protocol and then calls skb_mac_gso_segment
to segment the skb.
After the inner protocol segmentation is done the skb protocol
is set to mpls for each segment and the network and mac headers
restored.
Reported-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream commit:
commit 85de4a2101acb85c3b1dde465e84596ccca99f2c
Author: Jiri Benc <jbenc@redhat.com>
Date: Fri Sep 30 19:08:07 2016 +0200
openvswitch: use mpls_hdr
skb_mpls_header is equivalent to mpls_hdr now. Use the existing helper
instead.
Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
Ben Pfaff [Sun, 30 Apr 2017 21:03:02 +0000 (14:03 -0700)]
ovn-nbctl: Display and accept Neutron network, router, port names.
The names of these neutron:* keys in external_ids are unfortunate, but
they are the keys that the OVN utilities need to support if we want users
to be able to work with OpenStack in a convenient fashion rather than
having to cut and paste UUIDs everywhere.
This commit documents the meaning of these keys, in the hopes that other
CMS integrations will simply use them instead of inventing new ones.
Perhaps at some point we can clean this up, since bad names are a bad idea,
but it also would take a lot of coordination and probably multiple
releases.
Port names are slightly less useful in practice than switch or router names
because Neutron doesn't by default give names to ports. (You can add them
with "openstack port set --name", though.)
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
Ben Pfaff [Thu, 27 Apr 2017 22:47:59 +0000 (15:47 -0700)]
db-ctl-base: Add support for identifying a row based on a value in a map.
This will be used in an upcoming commit to allow Datapath_Binding records
in the OVN southbound database to be identified based on external-ids:name
and other map values.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
Ben Pfaff [Thu, 27 Apr 2017 20:33:12 +0000 (13:33 -0700)]
ovn-sbctl, ovn-nbctl, ovs-vsctl: Remove useless record id methods.
These only did anything if both the first two members of the struct were
nonnull, as you can see from the first test in get_row_by_id() in
lib/db-ctl-base.c, so these never did anything useful and I can't figure
out why they're there.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Russell Bryant <russell@ovn.org>
Ben Pfaff [Thu, 27 Apr 2017 16:36:36 +0000 (09:36 -0700)]
ovn-nbctl: Drop gratuitous indentation for "show" output.
"ovn-nbctl show" indented every line of output by at least 4 spaces, which
needlessly wastes horizontal space. This drops 4 spaces of indent from
each line of output.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org> Acked-by: Russell Bryant <russell@ovn.org>
Ben Pfaff [Sun, 30 Apr 2017 21:09:55 +0000 (14:09 -0700)]
uuid: Change semantics of uuid_is_partial_string().
Until now, uuid_is_partial_string() returned the number of characters at
the beginning of a string that were the beginning of a valid UUID. This
is useful, but all of the callers actually wanted to get a value of 0 if
the string contained a character that was invalid for a UUID. This makes
that change.
Examples:
"123" previously yielded 3 and still does.
"xyzzy" previously yielded 0 and still does.
"123xyzzy" previously yielded 3, now yields 0.
"e66250bb-9531-491b-b9c3-5385cabb0080" previously yielded 36, still does.
"e66250bb-9531-491b-b9c3-5385cabb0080xyzzy" previously yielded 36, now 0.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org> Acked-by: Russell Bryant <russell@ovn.org>
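The old and new semantics can be modelled in Python to reproduce the examples listed above. This is a sketch written from the description in this commit message, not a transcription of lib/uuid.c; the function names are hypothetical.

```python
UUID_LEN = 36
HEX_DIGITS = set("0123456789abcdefABCDEF")
DASH_POSITIONS = {8, 13, 18, 23}


def _valid_at(ch, i):
    # A UUID has dashes at fixed offsets and hex digits everywhere else.
    return ch == "-" if i in DASH_POSITIONS else ch in HEX_DIGITS


def partial_len_old(s):
    # Old behavior: count the leading characters that could begin a UUID.
    n = 0
    for i, ch in enumerate(s[:UUID_LEN]):
        if not _valid_at(ch, i):
            break
        n += 1
    return n


def partial_len_new(s):
    # New behavior: any invalid or excess character makes the result 0.
    if len(s) > UUID_LEN:
        return 0
    for i, ch in enumerate(s):
        if not _valid_at(ch, i):
            return 0
    return len(s)
```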
fedora: do not restart ovn svcs automatically on pkg upgrade
Similar to commit 5771f4765734 ("fedora: do not restart the
service on a pkg upgrade"), this change eliminates the
automatic restart of OVN services after upgrade.
Note that the post-uninstall scriptlet affected by this change
is executed from the previously installed package when upgrading,
so existing installations need to go through two package upgrades
before this change will take effect.
Signed-off-by: Lance Richardson <lrichard@redhat.com> Signed-off-by: Russell Bryant <rbryant@redhat.com>
Russell Bryant [Fri, 31 Mar 2017 15:27:23 +0000 (11:27 -0400)]
build: Don't run tests in rpm makefile targets.
The RPM build makefile targets are helpful during development and testing,
but I personally almost never want the tests to run when I use them.
Leave tests on by default in the spec file for when the package is built by
distro build systems, but disable it by default in the Makefile targets and
update the documentation accordingly.
Joe Stringer [Mon, 1 May 2017 19:58:06 +0000 (12:58 -0700)]
revalidator: Revalidate ukeys created from flows.
If there is no active ukey for a particular datapath flow, and it is
dumped from the datapath, then the revalidator threads will assemble a
ukey based on the datapath flow. This will allow tracking of the stats
for proper attribution, and future validation of the flow.
However, until now when creating the ukey in this context, the ukey's
'reval_seq' has been set to the current udpif's reval_seq. This implies
that the flow has been validated against the current flow table.
However, this is not true - The flow appeared in the datapath without
any prior knowledge in this OVS instance so we should set up the
reval_seq of the ukey to ensure that the flow will be validated during
the current dump/revalidation cycle.
See also revalidate_ukey().
Fixes: 23597df05226 ("upcall: Create ukeys in handler threads.") Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
ovn-northd: Add logical flows to support native DNS
OVN implements native DNS resolution which can be used to resolve the
internal DNS names belonging to a logical datapath.
To support this, a new table 'DNS' is added in the NB DB. A new column
'dns_records' is added in 'Logical_Switch' table which references the
'DNS' table.
The following flows are added for each logical switch if configured with
DNS records in the 'dns_records' column
- A logical flow in DNS_LOOKUP stage which uses the action 'dns_lookup'
to transform the DNS query to DNS reply packet and advances
to the next stage - DNS_RESPONSE.
- A logical flow in DNS_RESPONSE stage which implements the DNS responder
by sending the DNS reply from previous stage back to the inport.
This patch adds a new OVN action 'dns_lookup' to support native DNS.
ovn-controller parses this action and adds a NXT_PACKET_IN2
OF flow with 'pause' flag set.
A new table 'DNS' is added in the SB DB to look up and resolve
the DNS queries. When a valid DNS packet is received by
ovn-controller, it looks up the DNS name in the 'DNS' table
and if successful, it frames a DNS reply, resumes the packet
and stores 1 in the 1-bit subfield. If the packet is invalid
or cannot be resolved, it resumes the packet without any
modifications and stores 0 in the 1-bit subfield.
reg0[4] = dns_lookup(); next;
An upcoming patch will use this action and add logical flows.
Aaron Conole [Tue, 2 May 2017 20:17:48 +0000 (16:17 -0400)]
rhel: fix the fedora spec
When commit d0c961a99f57 ("lib/automake.mk: don't install
runtime directories") landed, it broke RPM based builds since
the requisite directories were no longer available. This commit
adds those directories back when making RPMs so that the package
manager can see them.
Ben Pfaff [Mon, 1 May 2017 20:19:43 +0000 (13:19 -0700)]
ovs-macros: Add helper to make 'wc' use POSIX compliant output format.
Several times, we've had to fix tests that used 'wc' and expected a
particular output format. POSIX is specific about the output format, but
neither GNU nor BSD wc honors it. This commit makes whatever 'wc' is on
the system use the POSIX output format.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: YAMAMOTO Takashi <yamamoto@ovn.org>
Han Zhou [Sat, 22 Apr 2017 01:55:27 +0000 (18:55 -0700)]
ovn-controller: Avoid recomputing when there are in-flight msgs.
When there are in-flight msgs being sent to OVS, ofctrl_put will
skip, which makes all the flows computed in that main loop
iteration useless. To avoid the wasted CPU cycles, a check is added
before lflow/physical flow run in each iteration.
This yields a huge performance improvement in the following test:
- 1 lswitch with 10 lports bound locally
- Each lport has an ingress ACL, referencing the same address-set
- The address-set has 10,000 IPv4 addresses
For each IP address in the address-set, there will be 3
OpenFlow rules generated for each ACL. So the total number
of rules is 300k+.
Without the patch, it takes 50+ minutes to install all the
rules to ovs-vswitchd.
With the patch, it takes 16 seconds to install all the rules
to ovs-vswitchd.
The reason is that the large number of rules are sent to
ovs-vswitchd gradually in many iterations of ovn-controller
main loop. Without the patch, CPU cycles are wasted in
lflow_run reprocessing the large address set in every
main loop iteration. With the patch, this reprocessing is
avoided in iterations where rules are still pending to be sent.
Signed-off-by: Han Zhou <zhouhan@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
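The rule count and the rough speedup quoted above follow from simple arithmetic, worked here in Python. The variable names are illustrative, and the timings are the lower bounds stated in the commit message ("50+ minutes" versus 16 seconds).

```python
lports = 10               # logical ports, each with one ingress ACL
addresses = 10_000        # IPv4 addresses in the shared address set
rules_per_address = 3     # OpenFlow rules per address per ACL

total_rules = lports * addresses * rules_per_address

seconds_without_patch = 50 * 60   # at least 50 minutes
seconds_with_patch = 16
speedup = seconds_without_patch / seconds_with_patch
```

This gives 300,000 rules before counting any other flows, and a speedup of at least about 187x.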
Aaron Conole [Mon, 1 May 2017 20:14:09 +0000 (16:14 -0400)]
checkpatch: fix pointer declaration
A common way of expressing 'raise to the power of' when authoring
comments uses **. This is currently getting caught by the pointer
spacing warning. So, catch it here.
Aaron Conole [Mon, 1 May 2017 20:14:08 +0000 (16:14 -0400)]
checkpatch: filename from hunks fix
Filenames that come from the hunk matches include the git-ified 'b/'
prefix, which makes jumping to the error file that much harder. This
patch corrects that by simply skipping those bytes.
Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
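A minimal sketch of the idea, assuming the filename arrives as a string such as "b/lib/packets.c". The function name is hypothetical and this is not the checkpatch code itself.

```python
def strip_git_prefix(path):
    # Drop the leading "b/" that git puts in front of the new file's name,
    # so the printed path can be opened directly from the source tree.
    if path.startswith("b/"):
        return path[2:]
    return path
```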
Aaron Conole [Mon, 1 May 2017 20:14:07 +0000 (16:14 -0400)]
checkpatch: print conformance
Other utilities (notoriously the linux kernel's checkpatch.pl) have a more
standardized form for printing file and lines. With this change, the
template used to print gains two enhancements:
1. Color
2. Conformance with the kernel's version of checkpatch.pl
Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Aaron Conole [Mon, 1 May 2017 20:14:06 +0000 (16:14 -0400)]
checkpatch: correct a parsing issue
Occasionally, characters will be sent which violate the
ascii decoder's sense of propriety. In fact, in-tree there are
a few such files (ex: tests/atlocal.in), and they cause an
exception to be raised when they are encountered.
Set the policy to ignore these cases. This means these bytes are
omitted from the text stream during processing.
Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
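The effect of an "ignore" policy can be shown with Python's standard bytes.decode. Whether checkpatch applies the policy through decode() or through another API is an assumption here, and the sample bytes are made up.

```python
# Two non-ASCII (UTF-8 encoded) quotation marks around the word "quoted".
raw = b"AT_CHECK \xe2\x80\x9cquoted\xe2\x80\x9d\n"

try:
    raw.decode("ascii")
    strict_ok = True
except UnicodeDecodeError:
    # The default "strict" policy raises on the first non-ASCII byte.
    strict_ok = False

# With errors="ignore", the offending bytes are silently dropped.
text = raw.decode("ascii", errors="ignore")
```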
Aaron Conole [Mon, 1 May 2017 20:14:03 +0000 (16:14 -0400)]
checkpatch: introduce a flexible framework
Developers wishing to add checks to checkpatch sift through an ad hoc mess,
currently. The process goes something like:
1. Figure out what to test in the patch
2. Write some code, quickly, that checks for that condition
3. Look through the state machine to find where the check should go
4. Ignore parts of the above and just throw something together
That worked fine for the initial development, but as interesting new tests
are developed, it is important to have a more flexible framework that lets
a developer just plug in a new test, easily.
This commit brings in a new framework that allows plugging in checks very
quickly. Hook up the line-length test as an initial demonstration.
Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
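One way such a plug-in framework could look is a table of checks that a single loop runs over each line. The sketch below is hypothetical: the dictionary keys, function names, and the 79-character limit are assumptions for illustration, not the structure actually used in checkpatch.

```python
MAX_LINE_LENGTH = 79  # assumed limit for this illustration


def line_too_long(line):
    return len(line) > MAX_LINE_LENGTH


# Adding a new test means appending one entry to this list.
CHECKS = [
    {
        "name": "line-length",
        "check": line_too_long,
        "message": "Line is longer than %d characters" % MAX_LINE_LENGTH,
    },
]


def run_checks(line, lineno, checks=CHECKS):
    # Return one warning string per check that the line fails.
    warnings = []
    for entry in checks:
        if entry["check"](line):
            warnings.append("%d: %s" % (lineno, entry["message"]))
    return warnings
```

Keeping the predicate and its message together in one entry is what lets a developer add a test without touching the parsing state machine.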
The Open vSwitch run, log, and DB directories are installed as part of the
normal `make install` process. However, this means they are created with
user and group ownership that may conflict with the desired user. For
example, running `make install` as root will install those files as
root:root, whereas the runtime user desired may be openvswitch:openvswitch.
Since these directories are automatically created as part of the ovs-ctl
command, and with the correct user:group permissions, it makes sense to
delay creation until these directories are actually required.
Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
install-doc: suggest to use ovs-ctl for start/stop
The install documentation guided users to manually start/stop
daemons. This is good information to have, but with the
existence of ovs-ctl, is probably not the best way to start
guiding new users of ovs.
Suggest that users start by running ovs-ctl start, and
document the ability to selectively start/stop the daemons.
The ovs-ctl script is already mentioned a bit in the install
doc, so this just reinforces its use.
Suggested-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
William Tu [Sat, 29 Apr 2017 13:08:43 +0000 (06:08 -0700)]
doc: Fix sphinx reference warning for windows.
Footnotes 5, 8, and 9 are not referenced in the windows.rst content,
causing the following error:
Warning, treated as error:
/root/ovs/Documentation/topics/windows.rst:506:Footnote [5] is not referenced.
Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
William Tu [Sat, 29 Apr 2017 13:30:59 +0000 (06:30 -0700)]
bridge: Prohibit "default" and "all" bridge name.
Under Linux, when a user creates a bridge named "default" or "all",
ovs-vsctl fails, but vswitchd keeps retrying in the background,
causing systemd-udev to reach 100% CPU utilization. This patch prevents
any attempt to create or open a netdev named "default" or "all" because
these two names are reserved on Linux:
/proc/sys/net/ipv4/conf/ always contains directories by these names.
The high CPU utilization is due to frequent calls into the kernel's
register_netdevice function, which invokes several kernel elements that
have registered on the netdevice notifier chain. Because creation fails,
OVS wakes up and tries to re-create the device, which ends up as a high-CPU loop.
VMWare-BZ: #1842388 Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Greg Rose <gvrose8192@gmail.com>
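The check described in the commit above can be illustrated with a minimal sketch. It is written in Python rather than the C used in OVS, and the names `RESERVED_NETDEV_NAMES`, `is_reserved_netdev_name`, and `open_netdev` are hypothetical, not taken from the patch.

```python
# Hypothetical illustration; not the code from the patch.
# Linux always has "default" and "all" under /proc/sys/net/ipv4/conf/,
# so these names cannot be used for a network device.
RESERVED_NETDEV_NAMES = frozenset({"default", "all"})


def is_reserved_netdev_name(name: str) -> bool:
    """Return True if 'name' must be rejected before any create/open attempt."""
    return name in RESERVED_NETDEV_NAMES


def open_netdev(name: str) -> str:
    # Failing early avoids the create/fail/retry loop described above.
    if is_reserved_netdev_name(name):
        raise ValueError(f"{name!r} is a reserved network device name")
    return name
```

Rejecting the name before touching the kernel means the failure is reported once instead of being retried indefinitely.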
Ben Pfaff [Fri, 17 Mar 2017 20:43:47 +0000 (13:43 -0700)]
travis: Break Mac OS build for format specifier warnings.
Until now, the Travis build for Mac OS X has been configured to ignore
format specifier warnings. These warnings have now been fixed, so this
commit changes such warnings to errors.
Suggested-by: Daniele Di Proietto <diproiettod@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
Joe Stringer [Fri, 28 Apr 2017 21:45:21 +0000 (14:45 -0700)]
ofp-actions: Document that learn(limit=0) is no limit.
The documentation was unclear that specifying a limit of 0 is the same
as specifying no limit. Controllers that wish to set a learn limit so
that no more than 0 flows are learned may omit the learn action.
Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
Unconditionally define OVS_CT_EVENT_* macros for the datapath netlink
interface so that we do not need to include platform dependent files.
This fixes the build on non-Linux (and non-Windows) platforms.
Also define a macro for the default set of events set by OVS userspace.
Reported-by: Joe Stringer <joe@ovn.org> Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
Andy Zhou [Fri, 28 Apr 2017 21:42:00 +0000 (14:42 -0700)]
test/ofproto: Improve test 'controller action without megaflows'
Commit af7535e7dbeb9 expanded the test to check the output
of meter stats, but without stripping out the duration.
This makes the test sensitive to the speed of
the machine that runs the test. Strip away the timing information
to improve test reliability.
Fixes: af7535e7dbeb9 (ofproto: Meter slowpath action when action
upcall meters are configured) Signed-off-by: Andy Zhou <azhou@ovn.org>
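Stripping timing information from test output, as the commit above does, can be sketched as follows. This is an illustration in Python, not the mechanism used by the OVS test suite; the helper name `strip_duration` and the assumed `duration:<seconds>s` field format are hypothetical.

```python
import re

# Matches e.g. "duration:1.234s" or "duration=0.5s". The exact field
# format in the real test output is an assumption.
_DURATION_RE = re.compile(r"(duration[:=])[0-9]+(?:\.[0-9]+)?s")


def strip_duration(text: str) -> str:
    """Replace every duration value with a fixed placeholder."""
    return _DURATION_RE.sub(r"\g<1>0.0s", text)
```

With the duration normalized, the expected output no longer depends on how fast the machine runs the test.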
Greg Rose [Thu, 27 Apr 2017 23:13:12 +0000 (16:13 -0700)]
compat: Fix build error in kernels 4.10
Use the acinclude.m4 configuration file to check for the net parameter
that was added to the ipv4 and ipv6 frags init functions in the 4.10
Linux kernel, to determine whether DEFRAG_ENABLE_TAKES_NET should be set,
and then check for that at compile time.
This is an alternative solution patch for the issue reported by Raymond
Burkholder and the patch submitted by Guoshuai Li.
[Committer notes]
Squash in "acinclude.m4: Add check for struct net parameter" which
provides the HAVE_DEFRAG_ENABLE_TAKES_NET.
Reported-by: Raymond Burkholder <ray@oneunified.net> CC: Guoshuai Li <ligs@dtdream.com> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Joe Stringer <joe@ovn.org>
Andy Zhou [Tue, 11 Apr 2017 23:10:41 +0000 (16:10 -0700)]
ofproto: Meter slowpath action when action upcall meters are configured
If a slow path action is a controller action, meter it when the
controller meter is configured. For other kinds of slow path actions,
meter it when the slowpath meter is configured.
Note that this patch only considers the meter configuration of the
packet's input bridge, which may not be the bridge on which the
action is generated.
Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
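The selection rule in the commit above can be sketched as below. This is a Python illustration under stated assumptions, not the ofproto code: `select_slowpath_meter` is a hypothetical helper, and the set of configured meter ids stands in for the input bridge's meter configuration. The two constants use the reserved meter ids defined by OpenFlow.

```python
from typing import Optional, Set

# Reserved virtual meter ids as defined by OpenFlow.
OFPM_SLOWPATH = 0xFFFFFFFD
OFPM_CONTROLLER = 0xFFFFFFFE


def select_slowpath_meter(is_controller_action: bool,
                          configured: Set[int]) -> Optional[int]:
    """Pick the meter for a slow path action, or None if it is unmetered.

    'configured' is the set of meter ids configured on the packet's
    input bridge (a simplification made for this sketch).
    """
    wanted = OFPM_CONTROLLER if is_controller_action else OFPM_SLOWPATH
    return wanted if wanted in configured else None
```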
Andy Zhou [Fri, 31 Mar 2017 00:03:08 +0000 (17:03 -0700)]
ofproto-dpif: Add 'meter_ids' to backer
Add 'meter_ids', an id-pool object to manage datapath meter ids, i.e.
provider_meter_id.
Currently, only the userspace datapath supports meters, and it implements
the provider_meter_id management. Moving this function to 'backer'
allows other datapath implementations to share the same logic.
Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
Andy Zhou [Thu, 30 Mar 2017 22:37:27 +0000 (15:37 -0700)]
ofproto: Store meters using hmap
Currently, meters are stored in a fixed pointer array. This is not
very efficient since the controller, at least in theory, can
pick any meter id (up to the limit of uint32_t), not necessarily
within the lower end of a region, or in close range to each other.
In particular, the OFPM_SLOWPATH and OFPM_CONTROLLER meters are specified
at the high end of the range.
Switch to using an hmap. The ofproto layer does not restrict
the number of meters that the controller can add, nor does it care
about the value of meter_id. The datapath limits the number of meters
that the ofproto layer can support at run time.
Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
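The motivation for the commit above can be illustrated with a short sketch that uses a Python dict in place of the C hmap. The `MeterTable` class and its methods are hypothetical and only show why a hash map suits sparse 32-bit ids; they do not reflect the ofproto data structures.

```python
# Reserved virtual meter ids as defined by OpenFlow; both sit near the
# top of the 32-bit range, far from small controller-chosen ids.
OFPM_SLOWPATH = 0xFFFFFFFD
OFPM_CONTROLLER = 0xFFFFFFFE


class MeterTable:
    """Hypothetical meter store keyed by meter id."""

    def __init__(self):
        # A dict uses memory proportional to the number of meters,
        # not to the largest meter id in use.
        self._meters = {}

    def add(self, meter_id: int, config: dict) -> None:
        if not 0 < meter_id <= 0xFFFFFFFF:
            raise ValueError("meter id out of uint32_t range")
        self._meters[meter_id] = config

    def lookup(self, meter_id: int):
        return self._meters.get(meter_id)

    def __len__(self) -> int:
        return len(self._meters)
```

A fixed array indexed by meter id would need billions of slots to hold both meter 1 and OFPM_CONTROLLER, while the map holds exactly two entries.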
Specify the event mask with CT commit including bits for CT features
exposed at the OVS interface (mark and label changes in addition to
basic creation and destruction of conntrack entries).
Without this any listener of conntrack update events will typically
(depending on system configuration) receive events for each L4 (e.g.,
TCP) state machine change, which can multiply the number of events
received per connection.
By including the new, related, and destroy events any listener of new
conntrack events gets notified of new related and non-related
connections, and any listener of destroy events will get notified of
deleted (typically timed out) conntrack entries.
By including the flags for mark and labels, any listener of conntrack
update events gets notified whenever the connmark or conntrack labels
are changed from the values reported within the new events.
VMware-BZ: #1837218 Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
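How such an event mask is composed can be sketched as follows. This is a Python illustration, not the OVS definitions: the bit positions are assumed to mirror the kernel's `enum ip_conntrack_events`, and the names `DEFAULT_EVENT_MASK` and `wants_event` are hypothetical.

```python
# Bit positions assumed to follow the kernel's enum ip_conntrack_events.
IPCT_NEW = 0
IPCT_RELATED = 1
IPCT_DESTROY = 2
IPCT_PROTOINFO = 5   # L4 state machine changes; deliberately left out
IPCT_MARK = 7
IPCT_LABEL = 10

# Hypothetical default: creation, destruction, and mark/label changes.
DEFAULT_EVENT_MASK = (
    (1 << IPCT_NEW)
    | (1 << IPCT_RELATED)
    | (1 << IPCT_DESTROY)
    | (1 << IPCT_MARK)
    | (1 << IPCT_LABEL)
)


def wants_event(mask: int, event_bit: int) -> bool:
    """Return True if events of type 'event_bit' pass the mask."""
    return bool(mask & (1 << event_bit))
```

Leaving the protocol-info bit out of the mask is what suppresses the per-state-change events mentioned in the commit message.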
Jarno Rajahalme [Sat, 11 Mar 2017 00:10:41 +0000 (16:10 -0800)]
tests: ICMP related to original direction test.
Normally ICMP responses are in the reply direction of a conntrack
entry. This test exercises an ICMP response to the original direction
of the conntrack entry.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
openvswitch: Delete conntrack entry clashing with an expectation.
Conntrack helpers do not check for a potentially clashing conntrack
entry when creating a new expectation. Also, nf_conntrack_in() will
check expectations (via init_conntrack()) only if a conntrack entry
can not be found. The expectation for a packet which also matches an
existing conntrack entry will not be removed by conntrack, and is
currently handled inconsistently by OVS, as OVS expects the
expectation to be removed when the connection tracking entry matching
that expectation is confirmed.
It should be noted that normally an IP stack would not allow reuse of
a 5-tuple of an old (possibly lingering) connection for a new data
connection, so this is a somewhat unlikely corner case. However, it is
possible that a misbehaving source could cause conntrack entries to be
created that could then interfere with new related connections.
Fix this in the OVS module by deleting the clashing conntrack entry
after an expectation has been matched. This causes the following
nf_conntrack_in() call to also find the expectation and remove it when
creating the new conntrack entry, and causes the forthcoming reply
direction packets to match the new related connection instead of the
old clashing conntrack entry.
Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action") Reported-by: Yang Song <yangsong@vmware.com> Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
Upstream commit 5a8145f7b222 ("netfilter: labels: don't emit ct event
if labels were not changed"), released in Linux 4.7, changed
nf_connlabels_replace() to trigger a conntrack event for a label change
only when the labels actually changed. Without this change an update
event is triggered even if the labels already have the values they are
being set to.
There is no way we can detect this functional change from Linux
headers, so provide replacements that work the same for older Linux
releases regardless of whether a distribution provides backports.
VMware-BZ: #1837218 Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
Add a new optional conntrack action attribute OVS_CT_ATTR_EVENTMASK,
which can be used in conjunction with the commit flag
(OVS_CT_ATTR_COMMIT) to set the mask of bits specifying which
conntrack events (IPCT_*) should be delivered via the Netfilter
netlink multicast groups. Default behavior depends on the system
configuration, but typically a lot of events are delivered. This can be
very chatty for the NFNLGRP_CONNTRACK_UPDATE group, even if only some
types of events are of interest.
Netfilter core init_conntrack() adds the event cache extension, so we
only need to set the ctmask value. However, if the system is
configured without support for events, the setting will be skipped because
the extension is not found.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Reviewed-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Joe Stringer <joe@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
Joe Stringer [Thu, 27 Apr 2017 01:03:12 +0000 (18:03 -0700)]
revalidator: Improve logging for transition_ukey().
There are a few cases where more introspection into ukey transitions
would be relevant for logging or assertion. Track the SOURCE_LOCATOR and
thread id when states are transitioned and use these for logging.
Suggested-by: Jarno Rajahalme <jarno@ovn.org> Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Joe Stringer [Thu, 27 Apr 2017 01:03:11 +0000 (18:03 -0700)]
revalidator: Avoid assert in transition_ukey().
There is a case where a flow is dumped from the kernel after the ukey is
already transitioned into an EVICTING/EVICTED/DELETED state, and the
revalidator thread attempts to shift that into UKEY_OPERATIONAL because
it was able to dump the flow from the datapath. This resulted in
triggering the assert in transition_ukey(). Detect this condition and
skip handling the flow (as it's already on its way out).
Users report:
> Program terminated with signal SIGABRT, Aborted.
> raise () from /lib/x86_64-linux-gnu/libc.so.6
> raise () from /lib/x86_64-linux-gnu/libc.so.6
> abort () from /lib/x86_64-linux-gnu/libc.so.6
> ovs_abort_valist
> vlog_abort_valist
> vlog_abort
> ovs_assert_failure
> transition_ukey (ukey=<optimized out>, dst=<optimized out>)
> at ofproto/ofproto-dpif-upcall.c:1674
> revalidate (revalidator=0x1cb36c8) at ofproto/ofproto-dpif-upcall.c:2324
> udpif_revalidator (arg=0x1cb36c8) at ofproto/ofproto-dpif-upcall.c:901
> ovsthread_wrapper (aux_=<optimized out>) at lib/ovs-thread.c:348
> start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
> clone () from /lib/x86_64-linux-gnu/libc.so.6
VMware-BZ: #1857694 Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
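The guard described in the commit above can be sketched as below. This is a Python illustration, not the revalidator code: the EVICTING, EVICTED, DELETED, and OPERATIONAL state names come from the commit message, while the CREATED and VISIBLE states, the numeric ordering, and the helper `should_handle_dumped_flow` are assumptions made for the sketch.

```python
from enum import IntEnum


class UkeyState(IntEnum):
    # Ordering is an assumption: states only ever move forward.
    CREATED = 1
    VISIBLE = 2
    OPERATIONAL = 3
    EVICTING = 4
    EVICTED = 5
    DELETED = 6


def should_handle_dumped_flow(state: UkeyState) -> bool:
    """Return False for a ukey that is already on its way out.

    Moving such a ukey back to OPERATIONAL would be a backwards
    transition, which is what tripped the assertion.
    """
    return state <= UkeyState.OPERATIONAL
```

Skipping the flow instead of asserting lets the revalidator continue when a dump races with eviction.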
Russell Bryant [Wed, 19 Apr 2017 16:41:37 +0000 (12:41 -0400)]
ovn: Bump ovn-nb schema version.
Commit b89d25e5694b made the "router" DHCPv4 option optional instead of
mandatory. This did not actually change the schema, but there's no good
way for a client of the northbound database to know if this change is
present without bumping the schema version. This is needed for a client to
work with versions before and after this change.
Reported-at: https://bugs.launchpad.net/networking-ovn/+bug/1670666 Fixes: b89d25e5694b ("ovn: Modify the DHCPv4 router option to optional") Signed-off-by: Russell Bryant <russell@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>