Daniel Alvarez [Wed, 28 Feb 2018 09:11:09 +0000 (10:11 +0100)]
python: avoid useless JSON conversion to enhance performance
This patch removes a useless conversion to/from JSON in the
processing of any 'modify' operations inside the process_update2
method in Python IDL implementation.
The previous code made resource creation take longer as the number
of elements in the row grew, because of that JSON conversion. This
patch eliminates the conversion, so the time now remains constant
regardless of the database contents, improving performance and scaling.
Reported-by: Daniel Alvarez <dalvarez@redhat.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2018-February/046263.html Signed-off-by: Daniel Alvarez <dalvarez@redhat.com> Acked-by: Terry Wilson <twilson@redhat.com> Tested-By: Terry Wilson <twilson@redhat.com> Acked-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
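To illustrate why the round trip was costly, the following sketch contrasts two ways of applying a 'modify' diff to a set-valued column. It is not the code from the Python IDL. The function names, the use of a plain Python set for the column, and the treatment of the diff as a symmetric difference are assumptions made for this example.

```python
import json


def apply_diff_via_json(current, diff):
    """Apply a set diff after round-tripping the whole column through JSON.

    Hypothetical helper: the cost grows with len(current), because every
    existing element is serialized and parsed again on each update.
    """
    as_json = json.dumps(sorted(current))   # touches every element
    restored = set(json.loads(as_json))     # touches every element again
    return restored ^ set(diff)             # symmetric difference


def apply_diff_direct(current, diff):
    """Apply the same diff in place; the work depends only on len(diff)."""
    for item in diff:
        if item in current:
            current.remove(item)
        else:
            current.add(item)
    return current
```

Both helpers produce the same result; only the second avoids work proportional to the size of the existing column.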
Jakub Sitnicki [Wed, 28 Feb 2018 16:06:45 +0000 (17:06 +0100)]
Fix type-setting in ovsdb-idlc man page.
- Remove extra escape sequences for switching to bold font.
- Add missing escape sequences for switching back to normal font.
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2018-February/344591.html Signed-off-by: Jakub Sitnicki <jkbs@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Greg Rose [Mon, 26 Feb 2018 22:10:15 +0000 (14:10 -0800)]
compat: Fix RHEL 7 compile
frag_percpu_counter_batch is a variable, not a define, so checking if
it is defined is an error and causes warning messages during compile
on RHEL 7 (or other 3.10 based) builds. Use a compat #define from
acinclude.m4 instead.
Fixes: 64d8cb7295 ("compat:inet_frag.h: Check for frag_percpu_counter_batch") Signed-off-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org>
Balazs Nemeth [Mon, 26 Feb 2018 09:10:35 +0000 (09:10 +0000)]
tests: Make packet-type-aware.at hash independent
When compiling with -msse4.2, a test case of packet-type-aware.at
fails because the CRC32-based hash function differs from mhash.
Fix this issue by parsing the port statistics one by one.
Signed-off-by: Balazs Nemeth <balazs.nemeth@ericsson.com> CC: Jan Scheurich <jan.scheurich@ericsson.com> CC: Zoltan Balogh <zoltan.balogh@ericsson.com> Fixes: 00135b869d7c ("xlate: fix xport lookup for recirc") Signed-off-by: Ben Pfaff <blp@ovn.org>
Mark Michelson [Mon, 26 Feb 2018 20:04:02 +0000 (14:04 -0600)]
Refer to database manpages in *ctl manpages
The ovn-nbctl, ovn-sbctl, and ovs-vsctl manpages are inconsistent in
their "Database Commands" section when it comes to referring to what
database tables exist. This commit amends this by making each *ctl
manpage reference the corresponding database manpage instead.
To aid in having a more handy list, the --help text of ovn-nbctl,
ovn-sbctl, and ovs-vsctl have been modified to list the available
tables. This is also referenced in the manpages for those applications.
Signed-off-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Fri, 23 Feb 2018 21:03:07 +0000 (13:03 -0800)]
ovn-northd: Consistently use Datapath_Binding UUID for hashing flows.
In one place, ovn-northd was hashing Logical_Switch or Logical_Router UUIDs
for ovn_lflow, and in another place it was hashing Datapath_Binding UUIDs.
This caused problems. This commit changes ovn-northd to always hash the
Datapath_Binding UUID.
Jakub Sitnicki reported the following performance improvement for a similar
fix:
Ilya Maximets [Mon, 26 Feb 2018 08:10:11 +0000 (11:10 +0300)]
ofproto-dpif-upcall: Fix using uninitialized fitness.
'upcall_xlate()' makes a decision to compose slow path actions
by checking the 'upcall->fitness', which is not initialized in
case of calling from the 'upcall_cb()'.
'upcall_cb()' receives the real flow, so the fitness should be
initialized as perfect.
zhangliping [Sat, 24 Feb 2018 03:30:58 +0000 (11:30 +0800)]
vlog: fix the incorrect zero padding in format_log_message
If the format specifier does not have the 0 flag, we should pad with
blanks instead of zeroes.
Signed-off-by: zhangliping <zhangliping02@baidu.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Mark Michelson <mmichels@redhat.com> Tested-by: Mark Michelson <mmichels@redhat.com>
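The padding rule described in the vlog fix can be sketched as follows. This is only an illustration in Python, not the C code in format_log_message; the function name and parameters are hypothetical, and left-justification is ignored for brevity.

```python
def pad_field(text, width, zero_flag=False):
    """Right-justify text to width (hypothetical helper).

    Zeroes are used as fill only when the format specifier carried the
    0 flag; otherwise the field is padded with blanks.
    """
    fill = "0" if zero_flag else " "
    return text.rjust(width, fill)
```

With this rule, pad_field("42", 5) gives "   42", while pad_field("42", 5, zero_flag=True) gives "00042".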
Ben Pfaff [Fri, 23 Feb 2018 22:03:15 +0000 (14:03 -0800)]
ofproto: Make ofproto_port_open_type() faster.
ofproto_port_open_type() was surprisingly slow because it called the
function ofproto_class_find__(), which itself was surprisingly slow because
it actually creates a set of strings and enumerates all of the available
classes.
This patch improves performance by eliminating the call to
ofproto_class_find__() from ofproto_port_open_type(). In turn that
required changing a parameter type and updating all the callers.
Possibly it would be worth making ofproto_class_find__() itself faster,
but it doesn't look like any of its other callers would be used in inner
loops.
For more background, see also
https://mail.openvswitch.org/pipermail/ovs-discuss/2018-February/046140.html
This patch arises as a result of testing done by Ali Ginwala and Han Zhou.
Their test showed that commit 2d4beba resulted in slower performance of
ovs-vswitchd than was seen in previous versions of OVS.
With this patch, Ali retested and reported that performance improved
drastically, by ~60%. The test for 10k lports, 40 LSs, 8 LRs, and 1k HVs
completed in 3 hours 39 min vs 8+ hours for branch-2.9. A CPU utilization
graph of a farm comparing Ben's ofproto patch vs branch-2.9 is available @
https://raw.githubusercontent.com/noah8713/ovn-scale-test/scale_results/results/ovs_2.9_vs_ben_ofproto.png
Reported-by: Mark Michelson <mmichels@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Tested-by: aginwala <aginwala@asu.edu> Signed-off-by: Ben Pfaff <blp@ovn.org>
Aaron Conole [Mon, 19 Feb 2018 14:55:43 +0000 (09:55 -0500)]
selinux: allow dpdkvhostuserclient sockets with newer libvirt
Newer libvirt and openstack versions will now label the unix socket as
an `svirt_tmpfs_t` object. This means that in order to support
deploying with the recommended configuration (using a
dpdkvhostuserclient socket), additional permissions need to be
installed as part of the selinux policy.
Ilya Maximets [Wed, 21 Feb 2018 13:32:39 +0000 (16:32 +0300)]
ofp-parse: Include missing ofp-actions.h.
This fixes MacOS build:
lib/ofp-parse.c:167:16:
error: use of undeclared identifier 'IPPORT_FTP'
lib/ofp-parse.c:171:16:
error: use of undeclared identifier 'IPPORT_TFTP'
CC: Ben Pfaff <blp@ovn.org> Fixes: 0d71302e36c4 ("ofp-util, ofp-parse: Break up into many separate modules.") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Wed, 14 Feb 2018 18:14:02 +0000 (10:14 -0800)]
ovn-northd: Reduce amount of flow hashing.
Jakub Sitnicki demonstrated that repeatedly calculating row hashes is
expensive, so this should improve ovn-northd performance.
Reported-by: Jakub Sitnicki <jkbs@redhat.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2018-February/344404.html Signed-off-by: Ben Pfaff <blp@ovn.org> Tested-by: Jakub Sitnicki <jkbs@redhat.com>
Ben Pfaff [Fri, 7 Oct 2016 16:47:43 +0000 (09:47 -0700)]
ovsdb-idlc: Implement synthetic columns.
A synthetic column is one that is not present in the actual database but
instead calculated by code in the client based on columns in the row. This
can be useful to avoid repeatedly calculating the same function of a row.
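The idea of a synthetic column can be sketched in Python as a value derived from stored columns and computed only once. This is not what ovsdb-idlc generates; the class, its column names, and the lazy caching strategy are assumptions chosen to keep the example short.

```python
class Row:
    """A row with two stored columns and one synthetic column (sketch)."""

    def __init__(self, name, priority):
        self.name = name          # stored column (hypothetical)
        self.priority = priority  # stored column (hypothetical)
        self._row_hash = None     # cache for the synthetic column
        self.computations = 0     # counts how often the value was derived

    @property
    def row_hash(self):
        """Synthetic column: derived from stored columns, computed once."""
        if self._row_hash is None:
            self.computations += 1
            self._row_hash = hash((self.name, self.priority))
        return self._row_hash
```

Reading row_hash repeatedly derives the value a single time, which is the benefit the commit message describes.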
Ben Pfaff [Fri, 7 Oct 2016 20:35:29 +0000 (13:35 -0700)]
ovsdb-idlc: Add infrastructure for IDL schema extensions.
An IDL schema is an OVSDB schema with some extra stuff in it. So far, all
of the extras have been at the top level. This commit makes it possible
for IDL schemas to have extra information at the table and column levels as
long as it is in an "extensions" member.
Ben Pfaff [Wed, 7 Sep 2016 22:23:44 +0000 (15:23 -0700)]
ovsdb-idlc: Add "cDecls" and "hDecls" IDL schema extensions.
An IDL schema is an OVSDB schema with some extra stuff in it: an idlPrefix
and an idlHeader at the top level to indicate what ovsdb-idlc needs to
generate the interface definitions. This commit adds support for two more
optional IDL schema extensions that allow extra code to be written to the
.c and .h file that ovsdb-idlc generates.
openvswitch: Remove padding from packet before L3+ conntrack processing
IPv4 and IPv6 packets may arrive with lower-layer padding that is not
included in the L3 length. For example, a short IPv4 packet may have
up to 6 bytes of padding following the IP payload when received on an
Ethernet device with a minimum packet length of 64 bytes.
Higher-layer processing functions in netfilter (e.g. nf_ip_checksum(),
and help() in nf_conntrack_ftp) assume skb->len reflects the length of
the L3 header and payload, rather than referring back to
ip_hdr->tot_len or ipv6_hdr->payload_len, and get confused by
lower-layer padding.
In the normal IPv4 receive path, ip_rcv() trims the packet to
ip_hdr->tot_len before invoking netfilter hooks. In the IPv6 receive
path, ip6_rcv() does the same using ipv6_hdr->payload_len. Similarly
in the br_netfilter receive path, br_validate_ipv4() and
br_validate_ipv6() trim the packet to the L3 length before invoking
netfilter hooks.
Currently in the OVS conntrack receive path, ovs_ct_execute() pulls
the skb to the L3 header but does not trim it to the L3 length before
calling nf_conntrack_in(NF_INET_PRE_ROUTING). When
nf_conntrack_proto_tcp encounters a packet with lower-layer padding,
nf_ip_checksum() fails causing a "nf_ct_tcp: bad TCP checksum" log
message. While extra zero bytes don't affect the checksum, the length
in the IP pseudoheader does. That length is based on skb->len, and
without trimming, it doesn't match the length the sender used when
computing the checksum.
In ovs_ct_execute(), trim the skb to the L3 length before higher-layer
processing.
Signed-off-by: Ed Swierk <eswierk@skyportsystems.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Ed Swierk <eswierk@skyportsystems.com> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org>
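The trimming step described in the conntrack padding fix can be illustrated on a raw byte buffer. This is a sketch in Python rather than the kernel code; the function name is hypothetical, and only the IPv4 case is shown (the Total Length field occupies bytes 2-3 of the IPv4 header).

```python
import struct

IPV4_MIN_HEADER = 20  # bytes in an IPv4 header without options


def trim_to_ipv4_length(l3_data):
    """Return l3_data cut to the IPv4 header's Total Length field.

    l3_data starts at the IPv4 header and may carry trailing lower-layer
    padding that is not part of the IP packet.
    """
    if len(l3_data) < IPV4_MIN_HEADER:
        raise ValueError("too short for an IPv4 header")
    (total_length,) = struct.unpack_from("!H", l3_data, 2)
    if total_length < IPV4_MIN_HEADER or total_length > len(l3_data):
        raise ValueError("Total Length is inconsistent with the data")
    return l3_data[:total_length]
```

For a 40-byte IPv4 packet padded to 46 bytes, the function returns the first 40 bytes, so later length-based computations see the size the sender used.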
Trivial fix removes unneeded semicolons after if blocks.
This issue was detected by using the Coccinelle software.
Signed-off-by: Christopher Díaz Riveros <chrisadr@gentoo.org> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Christopher Díaz Riveros <chrisadr@gentoo.org> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org>
Preempt counter APIs have been split out. Currently, hardirq.h just
includes irq_enter/exit APIs, which are not used by openvswitch at all.
So, remove the unused hardirq.h.
Signed-off-by: Yang Shi <yang.s@alibaba-inc.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: dev@openvswitch.org Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Yang Shi <yang.s@alibaba-inc.com> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Yang Shi <yang.shi@linux.alibaba.com>
OVS_NLERR prints a newline at the end of the message string, so the
message string does not need to include a newline explicitly. Done
using Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org>
openvswitch: Fix pop_vlan action for double tagged frames
skb_vlan_pop() expects skb->protocol to be a valid TPID for double
tagged frames. So set skb->protocol to the TPID and let skb_vlan_pop()
shift the true ethertype into position for us.
Fixes: 5108bbaddc37 ("openvswitch: add processing of L3 packets") Signed-off-by: Eric Garver <e@erig.me> Reviewed-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Eric Garver <e@erig.me> Fixes: a27c454ee0 ("datapath: add processing of L3 packets") Signed-off-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org>
openvswitch: do not propagate headroom updates to internal port
After commit 3a927bc7cf9d ("ovs: propagate per dp max headroom to
all vports") the need_headroom for the internal vport is updated
accordingly to the max needed headroom in its datapath.
That avoids the pskb_expand_head() costs when sending/forwarding
packets towards tunnel devices, at least for some scenarios.
We still require such copy when using the ovs-preferred configuration
for vxlan tunnels:
        br_int
      /        \
    tap        vxlan
               (remote_ip:X)

    br_phy
         \
         NIC
where the route towards the IP 'X' is via 'br_phy'.
When forwarding traffic from the tap towards the vxlan device, we
will call pskb_expand_head() in vxlan_build_skb() because
br-phy->needed_headroom is equal to tun->needed_headroom.
With this change we avoid updating the internal vport needed_headroom,
so that in the above scenario no head copy is needed, giving 5%
performance improvement in UDP throughput test.
As a trade-off, packets sent from the internal port towards a tunnel
device will now experience the head copy overhead. The rationale is
that the latter use-case is less relevant performance-wise.
Signed-off-by: paolo abeni <pabeni@redhat.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: paolo abeni <pabeni@redhat.com> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org>
Guoshuai Li [Thu, 15 Feb 2018 10:52:29 +0000 (18:52 +0800)]
ovn-controller: Fix crash when sending GARP when openflow disconnection.
This is call stack:
Program received signal SIGABRT, Aborted.
1 0x00007ffff6a4f8e8 in __GI_abort () at abort.c:90
2 0x00000000004765d6 in ofputil_protocol_to_ofp_version (protocol=<optimized out>) at lib/ofp-util.c:769
3 0x000000000047c19e in ofputil_encode_packet_out (po=po@entry=0x7fffffffa0e0, protocol=<optimized out>) at lib/ofp-util.c:7060
4 0x0000000000410870 in send_garp (garp=0x83cfe0, current_time=current_time@entry=1200375400) at ovn/controller/pinctrl.c:1738
5 0x000000000041430f in send_garp_run (active_tunnels=<optimized out>, local_datapaths=0x7fffffffc0a0, chassis_index=<optimized out>, chassis=0x8194d0, br_int=<optimized out>, ctx=0x7fffffffc080) at ovn/controller/pinctrl.c:2069
Signed-off-by: Guoshuai Li <ligs@dtdream.com> Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Benli Ye [Thu, 15 Feb 2018 01:52:07 +0000 (17:52 -0800)]
ofproto-dpif-ipfix: Fix an issue in flow key part
As struct ipfix_data_record_flow_key_iface did not account for
its length in the flow key part, problems may occur when the flow
key part is not long enough. Use MAX_IF_LEN and MAX_IF_DESCR
to pre-allocate memory for ipfix_data_record_flow_key_iface.
Signed-off-by: Daniel Benli Ye <daniely@vmware.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Add suffix LL to constant 1000 in order to give the compiler
complete information about the proper arithmetic to use. Notice
that this constant is used in a context that expects an expression
of type long long int (64 bits, signed).
The expression (band->burst_size + band->rate) * 1000 is currently
being evaluated using 32-bit arithmetic.
Addresses-Coverity-ID: 1461563 ("Unintentional integer overflow") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org>
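The effect of the LL suffix can be worked through with concrete numbers. The sketch below emulates C integer widths in Python by masking; it assumes that burst_size and rate are unsigned 32-bit values, and the sample inputs are made up for illustration.

```python
U32_MASK = 0xFFFFFFFF


def scaled_32bit(burst_size, rate):
    """Emulate (burst_size + rate) * 1000 in unsigned 32-bit arithmetic."""
    total = (burst_size + rate) & U32_MASK
    return (total * 1000) & U32_MASK        # the product wraps at 2**32


def scaled_64bit(burst_size, rate):
    """Emulate (burst_size + rate) * 1000LL: the product is 64 bits wide."""
    total = (burst_size + rate) & U32_MASK  # the sum is still 32 bits
    return total * 1000
```

With burst_size = 3000000 and rate = 2000000, the 32-bit product wraps to 705032704, whereas the 64-bit product is the intended 5000000000.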
openvswitch: meter: fix NULL pointer dereference in ovs_meter_cmd_reply_star
It seems that the intention of the code is to null check the value
returned by function genlmsg_put. But the current code is null
checking the address of the pointer that holds the value returned
by genlmsg_put.
Fix this by properly null checking the value returned by function
genlmsg_put in order to avoid a potential null pointer dereference.
Addresses-Coverity-ID: 1461561 ("Dereference before null check")
Addresses-Coverity-ID: 1461562 ("Dereference null return value") Fixes: 96fbc13d7e77 ("openvswitch: Add meter infrastructure") Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Gustavo A. R. Silva <garsilva@embeddedor.com> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org>
openvswitch: Using kfree_rcu() to simplify the code
The callback function of call_rcu() just calls a kfree(), so we
can use kfree_rcu() instead of call_rcu() + callback function.
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org>
openvswitch: Fix return value check in ovs_meter_cmd_features()
In case of error, the function ovs_meter_cmd_reply_start() returns
ERR_PTR() not NULL. The NULL test in the return value check should
be replaced with IS_ERR().
Fixes: 96fbc13d7e77 ("openvswitch: Add meter infrastructure") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org>
The OVS kernel datapath so far does not support the OpenFlow meter action.
This is the first stab at adding kernel datapath meter support.
This implementation supports only drop band type.
Signed-off-by: Andy Zhou <azhou@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Added a compat layer fixup for nla_parse.
Added another compat fixup for ktime_get_ns.
Cc: Andy Zhou <azhou@ovn.org> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org>
Meter has its own netlink family. Define netlink messages and attributes
for communicating with the user space programs.
Signed-off-by: Andy Zhou <azhou@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Andy Zhou <azhou@ovn.org> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org>
openvswitch: reliable interface indentification in port dumps
This patch allows reliable identification of netdevice interfaces connected
to openvswitch bridges. In particular, user space queries the netdev
interfaces belonging to the ports for statistics, up/down state, etc.
Datapath dump needs to provide enough information for the user space to be
able to do that.
Currently, only interface names are returned. This is not sufficient, as
openvswitch allows its ports to be in different name spaces and the
interface name is valid only in its name space. What is needed and generally
used in other netlink APIs, is the pair ifindex+netnsid.
The solution is addition of the ifindex+netnsid pair (or only ifindex if in
the same name space) to vport get/dump operation.
On request side, ideally the ifindex+netnsid pair could be used to
get/set/del the corresponding vport. This is not implemented by this patch
and can be added later if needed.
Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Added compat fixup for peernet2id.
Cc: Jiri Benc <jbenc@redhat.com> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org>
Later patches will invoke get_dp() outside of datapath.c. Export it.
Signed-off-by: Andy Zhou <azhou@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Andy Zhou <azhou@ovn.org> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org>
Ben Pfaff [Thu, 8 Feb 2018 21:25:39 +0000 (13:25 -0800)]
Implement OF1.3 extension for OF1.4 role status feature.
ONF extension pack 1 for OpenFlow 1.3 defines how to implement the OpenFlow
1.4 "role status" message in OpenFlow 1.3. This commit implements that
feature.
ONF-JIRA: EXT-191 Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: William Tu <u9012063@gmail.com>
Ben Pfaff [Fri, 9 Feb 2018 18:04:26 +0000 (10:04 -0800)]
ofp-util, ofp-parse: Break up into many separate modules.
ofp-util had been far too large and monolithic for a long time. This
commit breaks it up into units that make some logical sense. It also
moves the pieces of ofp-parse that were specific to each unit into the
relevant unit.
Most of this commit is just moving code around.
Signed-off-by: Ben Pfaff <blp@ovn.org> Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Ansis Atteka [Tue, 16 Jan 2018 00:18:30 +0000 (16:18 -0800)]
poc: Introduce Proof of Concepts (Package building)
This patch sets up foundations for Proof of Concepts that
simply materialize documentation into Ansible instructions
executed in virtualized Vagrant environment.
This Proof of Concept makes it easy to build:
1. *.deb packages on Ubuntu 16.04; AND
2. *.rpm packages on CentOS 7.4.
It also sets up DEB and RPM repository over HTTP that can
be used to pull these openvswitch packages with apt-get
or yum from another host.
This particular Proof of Concept is intended to address
the following use cases:
1. for new OVS users to see how debian and rpm packages are
built;
2. for developers to easily check for packaging build
regressions;
3. for developers to easily share their sandbox builds
into QE setups (as opposed to manually copying binaries);
4. for developers to add other Proof of Concepts
that possibly may require full end-to-end integration
with other thirdparty projects (e.g. DPI, libvirt, IPsec)
and need Open vSwitch packages.
Tested-by: Greg Rose <gvrose8192@gmail.com> Reviewed-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Ansis Atteka <aatteka@ovn.org>
openvswitch: use ktime_get_ts64() instead of ktime_get_ts()
timespec is deprecated because of the y2038 overflow, so let's convert
this one to ktime_get_ts64(). The code is already safe even on 32-bit
architectures, since it uses monotonic times. On 64-bit architectures,
nothing changes, while on 32-bit architectures this avoids one
type conversion.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Added a compatibility check for whether ktime_get_ts64() exists.
If not, then just continue using ktime_get_ts(). I added a new
compatibility header file "timekeeping.h".
Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org>
openvswitch: fix the incorrect flow action alloc size
If we want to add a datapath flow that has more than 500 vxlan output
actions, we will get the following error reports:
openvswitch: netlink: Flow action size 32832 bytes exceeds max
openvswitch: netlink: Flow action size 32832 bytes exceeds max
openvswitch: netlink: Actions may not be safe on all matching packets
... ...
It seems that we can simply enlarge the MAX_ACTIONS_BUFSIZE to fix it, but
this is not the root cause. For example, for a vxlan output action, we need
about 60 bytes for the nlattr, but after it is converted to the flow
action, it only occupies 24 bytes. This means that we can still support
more than 1000 vxlan output actions for a single datapath flow under the
current 32k max limitation.
So even if nla_len(attr) is larger than MAX_ACTIONS_BUFSIZE, we
shouldn't report EINVAL but should keep going, as the size check can be
done by reserve_sfa_size.
Signed-off-by: zhangliping <zhangliping02@baidu.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: zhangliping <zhangliping02@baidu.com> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org>
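The arithmetic behind the flow action size argument can be checked directly. The sketch below assumes that the "32k" limit means 32 * 1024 bytes and uses the approximate per-action sizes quoted in the message (about 60 bytes as a netlink attribute, 24 bytes as a flow action); the constant and function names are illustrative only.

```python
ACTIONS_LIMIT = 32 * 1024      # assumed meaning of the "32k" limit
NLATTR_BYTES_PER_OUTPUT = 60   # approximate netlink size per vxlan output
FLOW_BYTES_PER_OUTPUT = 24     # size once converted to a flow action


def max_outputs(bytes_per_output, limit=ACTIONS_LIMIT):
    """Return how many output actions of the given size fit in the limit."""
    return limit // bytes_per_output
```

Checking the netlink size against the limit caps a flow at max_outputs(NLATTR_BYTES_PER_OUTPUT) = 546 outputs, while the converted actions would allow max_outputs(FLOW_BYTES_PER_OUTPUT) = 1365, which matches the "more than 1000" figure above.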
net: openvswitch: datapath: fix data type in queue_gso_packets
gso_type is being used in binary AND operations together with SKB_GSO_UDP.
The issue is that variable gso_type is of type unsigned short and
SKB_GSO_UDP expands to more than 16 bits:
SKB_GSO_UDP = 1 << 16
This makes any binary AND operation between gso_type and SKB_GSO_UDP
always evaluate to zero, hence making some code unreachable and likely
causing undesired behavior.
Fix this by changing the data type of variable gso_type to unsigned int.
Addresses-Coverity-ID: 1462223 Fixes: 0c19f846d582 ("net: accept UFO datagrams from tuntap and packet") Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
While backporting this I found another couple of instances of the
same issue so I fixed them up as well.
Cc: Gustavo A. R. Silva <garsilva@embeddedor.com> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org>
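The truncation behind the gso_type fix can be reproduced by emulating the variable widths. This Python sketch assumes unsigned short is 16 bits and unsigned int is 32 bits, as on common Linux targets; the helper name is hypothetical.

```python
SKB_GSO_UDP = 1 << 16  # value quoted in the commit message


def has_udp_gso(gso_type, width_bits):
    """Test the SKB_GSO_UDP bit after storing gso_type in width_bits bits."""
    stored = gso_type & ((1 << width_bits) - 1)  # emulate the narrow variable
    return bool(stored & SKB_GSO_UDP)
```

A 16-bit variable drops bit 16 on assignment, so has_udp_gso(SKB_GSO_UDP, 16) is False, while has_udp_gso(SKB_GSO_UDP, 32) is True.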
openvswitch: Fix an error handling path in 'ovs_nla_init_match_and_action()'
All other error handling paths in this function go through the 'error'
label. This one should do the same.
Fixes: 9cc9a5cb176c ("datapath: Avoid using stack larger than 1024.") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Fixes: 850c2a4d1a ("datapath: Avoid using stack larger than 1024.") Signed-off-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org>
Greg Rose [Wed, 7 Feb 2018 15:30:06 +0000 (07:30 -0800)]
compat: Fix compiler headers
Since Linux kernel upstream commit d15155824c50
("linux/compiler.h: Split into compiler.h and compiler_types.h") this
error check for the gcc compiler header is no longer valid. Remove it
so that openvswitch builds on Linux kernels 4.14.8 and later.
Signed-off-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org>
Greg Rose [Wed, 7 Feb 2018 15:30:03 +0000 (07:30 -0800)]
compat:inet_frag.h: Check for frag_percpu_counter_batch
Fix up the compat layer to check for frag_percpu_counter_batch and
if not present then use atomic_sub and atomic_add as per the
backport in the 3.16.50 LTS kernel.
Signed-off-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org>
Mark Michelson [Fri, 9 Feb 2018 15:11:00 +0000 (09:11 -0600)]
ovn: Allow DNS lookups over IPv6
There was a bug in DNS request handling where the incoming packet was
assumed to be IPv4.
The result was that for the outgoing packet, we would attempt to write
the IPv4 checksum and total length into what was actually an IPv6
header. This resulted in the source IPv6 address getting corrupted.
Later, the source and destination IPv6 addresses would get swapped,
resulting in the DNS response being sent to a nonsense destination.
With this change, we check the ethertype of the packet to determine what
l3 information to write, and where to write it. A test is also included
that verifies that this works as expected.
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1539608 Signed-off-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
The OVS master and 2.8 branches have merged the NSH userspace
patch series. This patch is ported to enable NSH support
in the kernel datapath, so that OVS can support
NSH in compat mode.
Signed-off-by: Yi Yang <yi.y.yang@intel.com> Acked-by: Jiri Benc <jbenc@redhat.com> Acked-by: Eric Garver <e@erig.me> Acked-by: Pravin Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Yi Yang <yi.y.yang@intel.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Add a new nsh/ directory. It currently holds only GSO functions but more
will come: in particular, code shared by openvswitch and tc to manipulate
NSH headers.
For now, assume there's no hardware support for NSH segmentation. We can
always introduce netdev->nsh_features later.
Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Yi Yang <yi.y.yang@intel.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Reviewed-by: Greg Rose <gvrose8192@gmail.com>
NSH (Network Service Header)[1] is a new protocol for service
function chaining. It can be handled as an L3 protocol like
IPv4 and IPv6. Eth + NSH + Inner packet and VxLAN-gpe + NSH +
Inner packet are two typical use cases.
This patch adds NSH header structures and helpers for NSH GSO
support and Open vSwitch NSH support.
[Jiri: added nsh_hdr() helper and renamed the header struct to "struct
nshhdr" to match the usual pattern. Removed packet type defines, these are
now shared with VXLAN-GPE.]
Signed-off-by: Yi Yang <yi.y.yang@intel.com> Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Yi Yang <yi.y.yang@intel.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Reviewed-by: Greg Rose <gvrose8192@gmail.com>
The values are shared between VXLAN-GPE and NSH. Originally probably by
coincidence but I notified both working groups about this last year and they
seem to keep the values in sync since then.
Hopefully they'll get a single IANA registry for the values, too. (I asked
them for that.)
Factor out the code to be shared by the NSH implementation.
NSH and MPLS values are added in this patch, too. For MPLS, the drafts
incorrectly assign only a single value, while we have two MPLS ethertypes.
I raised the problem with both groups. For now, I assume the value is for
unicast.
Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Yi Yang <yi.y.yang@intel.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Reviewed-by: Greg Rose <gvrose8192@gmail.com>
An IEEE EtherType, 0x894F, has been allocated for NSH.
Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Yi Yang <yi.y.yang@intel.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Ben Pfaff [Tue, 6 Feb 2018 18:00:48 +0000 (10:00 -0800)]
expr: Make expr_sort() always yield an expr that satisfies invariants.
Expressions of type EXPR_T_AND are supposed to follow an invariant that
they have at least 2 clauses, but expr_sort() did not always follow that;
for example, applying it to (x[0] == 1 && x[1] == 1) yielded the 1-child
EXPR_T_AND expression x[0..1] == 3. This commit fixes the problem.
I don't know of any externally visible negative consequences for this
problem, but it made the code harder to reason about.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Numan Siddique <nusiddiq@redhat.com>
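The EXPR_T_AND invariant discussed in the expr_sort() commit can be sketched with a toy expression representation. This is not the code in the OVN expression library; the tuple encoding and both helper names are assumptions made for illustration.

```python
def merge_bit_tests(bit_tests):
    """Combine single-bit equality tests on one field into a range test.

    bit_tests maps bit index -> required bit value (0 or 1) and must cover
    a contiguous run of bits. Returns ("cmp", low_bit, high_bit, value).
    """
    low, high = min(bit_tests), max(bit_tests)
    if set(bit_tests) != set(range(low, high + 1)):
        raise ValueError("bits must be contiguous")
    value = sum(bit << (index - low) for index, bit in bit_tests.items())
    return ("cmp", low, high, value)


def make_and(clauses):
    """Build an AND node while keeping the 'at least 2 clauses' invariant."""
    if not clauses:
        raise ValueError("an AND needs at least one clause")
    if len(clauses) == 1:
        return clauses[0]  # collapse rather than emit a 1-child AND
    return ("and", list(clauses))
```

Merging x[0] == 1 and x[1] == 1 yields the single comparison x[0..1] == 3, and make_and() then returns that comparison itself instead of wrapping it in an AND with one child.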
Ben Pfaff [Fri, 2 Feb 2018 21:09:03 +0000 (13:09 -0800)]
ovs-vsctl: Use default socket name in tests.
By using the default socket name "db.sock", instead of "socket", we can
avoid passing --db=unix:socket to all the ovs-vsctl invocations, which is
kind of nice.
Signed-off-by: Ben Pfaff <blp@ovn.org> Tested-by: Yifeng Sun <pkusunyifeng@gmail.com> Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Ben Pfaff [Fri, 2 Feb 2018 18:23:37 +0000 (10:23 -0800)]
ovs-vsctl: Remove superfluous OVS_VSCTL_CLEANUP from tests.
Since on_exit was introduced a long, long time ago, it has no longer been
necessary to have individual calls to OVS_VSCTL_CLEANUP sprinkled
everywhere in the test code. This change makes the tests easier to read.
Signed-off-by: Ben Pfaff <blp@ovn.org> Tested-by: Yifeng Sun <pkusunyifeng@gmail.com> Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Ben Pfaff [Wed, 24 Jan 2018 19:40:20 +0000 (11:40 -0800)]
odp-util: Always report ODP_FIT_TOO_LITTLE for IGMP.
OVS datapaths don't understand or parse IGMP fields, but OVS userspace
does, so this commit updates odp_flow_key_to_flow() to report that properly
to the caller.
Reported-by: Huanle Han <hanxueluo@gmail.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2018-January/343665.html Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Wed, 24 Jan 2018 19:40:19 +0000 (11:40 -0800)]
ofproto-dpif-upcall: Slow path flows that datapath can't fully match.
In the OVS architecture, when a datapath doesn't have a match for a packet,
it sends the packet and the flow that it extracted from it to userspace.
Userspace then examines the packet and the flow and compares them.
Commonly, the flow is the same as what userspace expects, given the packet,
but there are two other possibilities:
- The flow lacks one or more fields that userspace expects to be there,
that is, the datapath doesn't understand or parse them but userspace
does. This is, for example, what would happen if current OVS
userspace, which understands and extracts TCP flags, were to be
paired with an older OVS kernel module, which does not. Internally
OVS uses the name ODP_FIT_TOO_LITTLE for this situation.
- The flow includes fields that userspace does not know about, that is,
the datapath understands and parses them but userspace does not.
This is, for example, what would happen if an old OVS userspace that
does not understand or extract TCP flags, were to be paired with a
recent OVS kernel module that does. Internally, OVS uses the name
ODP_FIT_TOO_MUCH for this situation.
The latter is not a big deal and OVS doesn't have to do much to cope with
it.
The former is more of a problem. When the datapath can't match on all the
fields that OVS supports, it means that OVS can't safely install a flow at
all, other than one that directs packets to the slow path. Otherwise, if
OVS did install a flow, it could match a packet that does not match the
flow that OVS intended to match and could cause the wrong behavior.
Somehow, this nuance was lost a long time ago. From about 2013 until today,
it seems that OVS has ignored ODP_FIT_TOO_LITTLE. Instead, it happily
installs a flow regardless of whether the datapath can actually fully match
it. I imagine that this is rarely a problem because most of the time
the datapath and userspace are well matched, but it is still an important
problem to fix. This commit fixes it, by forcing flows into the slow path
when the datapath cannot match specifically enough.
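The three cases above can be sketched as follows. This is an illustration in Python, not the OVS C code: `Fitness`, `classify`, and `must_slow_path` are hypothetical names, field sets are modeled as plain sets of strings, and giving the "too little" case priority when both conditions hold is an assumption.

```python
from enum import Enum


class Fitness(Enum):
    PERFECT = "perfect"        # datapath flow is what userspace expects
    TOO_MUCH = "too_much"      # datapath parsed fields userspace ignores
    TOO_LITTLE = "too_little"  # datapath missed fields userspace expects


def classify(datapath_fields, userspace_fields):
    """Compare the fields the datapath extracted with those userspace expects."""
    dp, us = set(datapath_fields), set(userspace_fields)
    if us - dp:
        # Assumption: missing fields take priority over extra ones,
        # since this is the case that can cause wrong matches.
        return Fitness.TOO_LITTLE
    if dp - us:
        return Fitness.TOO_MUCH
    return Fitness.PERFECT


def must_slow_path(fitness):
    """Only a flow the datapath cannot fully match is forced to the slow path."""
    return fitness is Fitness.TOO_LITTLE
```

For example, a datapath that reports `{"eth", "ip", "tcp"}` while userspace expects `{"eth", "ip", "tcp", "tcp_flags"}` is classified as too little, so the sketch sends the flow to the slow path instead of installing a match the datapath cannot express.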
CC: Ethan Jackson <ejj@eecs.berkeley.edu>
Fixes: e79a6c833e0d ("ofproto: Handle flow installation and eviction in upcall.")
Reported-by: Huanle Han <hanxueluo@gmail.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2018-January/343665.html
Signed-off-by: Ben Pfaff <blp@ovn.org>
Shashank Ram [Thu, 1 Feb 2018 02:58:55 +0000 (18:58 -0800)]
datapath-windows: Allow compiling all targets using SDK 10.0
Previously, Win8/8.1 targets would use SDK 8.1. However, it's
recommended to use the newer SDK, as newer VS versions typically
drop support for older SDKs later on. This patch adds support
for compiling all targets (Win8/8.1/10) using the 10.0 SDK.
Note that this patch does not drop support for older SDKs.
Signed-off-by: Shashank Ram <rams@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>
Tonghao Zhang [Sun, 4 Feb 2018 14:45:38 +0000 (06:45 -0800)]
netdev-linux: Report netdev change events when mac changed.
When the MAC address of a port on a bridge is changed, for example,
$ ip link set dev eth0 address 00:11:22:33:44:55
we should reconfigure the datapath ID and the MAC address of the local port.
But currently Open vSwitch does not do that as expected.
A simple example of how to reproduce it:
$ ovs-vsctl add-br br0
$ ifconfig br0 # for example, mac is c6:c6:d7:46:b4:4b
$ ip link set dev br0 address 00:11:22:33:44:55
$ ifconfig br0 # mac of br0 will be 00:11:22:33:44:55
then repeat:
$ ip link set dev br0 address 00:11:22:33:44:55
$ ifconfig br0 # mac of br0 will be c6:c6:d7:46:b4:4b
This patch reports a change event when a port's MAC address changes, so
that Open vSwitch reconfigures the datapath ID and the MAC address of the
local port.
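The general idea of reporting a change event can be sketched in Python as below. This is not the netdev-linux code: `PortState`, `update_mac`, and the `change_seq` counter are hypothetical names used only to show the pattern of bumping a sequence number when the observed MAC address differs from the cached one, so that a consumer knows it must reconfigure.

```python
class PortState:
    """Hypothetical cache of a port's MAC address with a change counter."""

    def __init__(self, mac):
        self.mac = mac.lower()
        self.change_seq = 0  # incremented on every reported change

    def update_mac(self, new_mac):
        """Record a newly observed MAC; return True if a change is reported."""
        new_mac = new_mac.lower()
        if new_mac == self.mac:
            return False          # nothing changed, no event
        self.mac = new_mac
        self.change_seq += 1      # consumers compare this to their last seen value
        return True
```

A consumer that remembers the last `change_seq` it acted on can compare it with the current value and reconfigure only when the two differ.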
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Fri, 2 Feb 2018 23:16:22 +0000 (15:16 -0800)]
util: Use lookup table to optimize hexit_value().
Daniel Alvarez Sanchez reported a significant overall speedup in ovn-northd
due to a similar patch.
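The lookup-table technique named in the subject line can be sketched as follows. This is a Python illustration, not the C function from the patch: the 256-entry table, the helper `_build_table`, and the choice of -1 for characters that are not hex digits are assumptions made for the example.

```python
def _build_table():
    """Precompute the value of every possible byte once, up front."""
    table = [-1] * 256                      # assumption: -1 marks "not a hex digit"
    for i, c in enumerate("0123456789"):
        table[ord(c)] = i
    for i, c in enumerate("abcdef"):
        table[ord(c)] = 10 + i
        table[ord(c.upper())] = 10 + i      # accept both cases
    return table


_HEXIT_TABLE = _build_table()


def hexit_value(c):
    """Return 0..15 for a hex digit, or -1 otherwise.

    c may be a one-character string or an integer byte value.
    """
    code = c if isinstance(c, int) else ord(c)
    if 0 <= code < 256:
        return _HEXIT_TABLE[code]           # one indexed load, no branching on ranges
    return -1
```

The design trade is a small fixed table in exchange for replacing a chain of range comparisons with a single index operation on each call.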
Reported-by: Daniel Alvarez Sanchez <dalvarez@redhat.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2018-February/046120.html
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Daniel Alvarez <dalvarez@redhat.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>