Justin Pettit [Sat, 4 Jun 2016 18:49:46 +0000 (11:49 -0700)]
ovn-northd: Use strings from extract_lsp_addresses().
Extract port security and logical switch port addresses once and store
them as part of the ovn_port structure. Use the string representations
from the extracted addresses.
Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Justin Pettit [Fri, 3 Jun 2016 04:44:38 +0000 (21:44 -0700)]
ovn-util: Add string representations to 'lport_addresses'.
A future commit will reduce the number of conversions used by the
existing users of 'lport_addresses'. This change will also make it
possible to use this structure for logical router port networks.
Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Justin Pettit [Tue, 17 May 2016 13:02:53 +0000 (06:02 -0700)]
ovn: Remove 'default_gw' from logical router table.
With static routes, it's not necessary to have a separate default
gateway parameter. This also makes configuring router ports clearer
when IPv6 and IPv4 addresses may be assigned to the same port.
Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
References to the specific tables should probably be dropped, since
they'll continue to drift towards wrong. In the meantime, correct the
ones that are there.
Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Joe Stringer [Fri, 24 Jun 2016 21:15:21 +0000 (14:15 -0700)]
ovs-bugtool: Port to python3.
Fix python2-specific code in ovs-bugtool:
* python2 long() is the same as python2 int() and python3 int(). Convert
the long() to int().
* raw_input() was renamed to input(). Use python-six's input() on python2.
* Drop lambda tuple unpacking, we can go back to regular lambda syntax.
* file() can be replaced with open().
Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
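The Python 2 to Python 3 changes listed for ovs-bugtool can be sketched as follows. This is a minimal illustration, not code from ovs-bugtool: the helper names (parse_size, sort_by_value, read_text, read_line) are hypothetical, and the raw_input() fallback is a standard-library stand-in for the python-six approach the commit mentions.

```python
# Illustrative only: these helpers are hypothetical, not taken from ovs-bugtool.

def parse_size(text):
    # Python 2 code might have called long(text); Python 3 has a single
    # arbitrary-precision int type, so int() is enough.
    return int(text)


def sort_by_value(pairs):
    # Python 2 accepted "lambda (k, v): v". Python 3 removed tuple
    # parameter unpacking, so index into the pair instead.
    return sorted(pairs, key=lambda kv: kv[1])


def read_text(path):
    # The file() builtin is gone in Python 3; open() works on both.
    with open(path) as f:
        return f.read()


# raw_input() was renamed to input(). The commit uses python-six for this;
# the fallback below is a stdlib-only stand-in (an assumption).
try:
    read_line = raw_input  # Python 2
except NameError:
    read_line = input      # Python 3
```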
Joe Stringer [Mon, 11 Jul 2016 17:29:18 +0000 (10:29 -0700)]
system-traffic: Use NC_EOF_OPT in truncate tests.
NC_EOF_OPT should always be passed to netcat in system-traffic tests
when invoking netcat to send a single packet that does not expect a
response. While on typical fedora/RH based distributions the default
behaviour is to send the packet then return, there are multiple other
implementations of netcat that do not do this (for example, those used
by Debian and Ubuntu by default). For these alternative implementations,
we provide $NC_EOF_OPT to ensure that netcat simply sends the packet
then returns immediately.
Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
datapath: Fix ip tunnel compilation for newer kernel.
The compat iptunnel_xmit() is used in the backported tunnel code, but
it was only defined for kernels older than 3.18. This patch fixes
that by compiling it for all kernels that need to use the backported
tunnel implementation.
Reported-by: Justin Pettit <jpettit@ovn.org> Reported-by: Joe Stringer <joe@ovn.org> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
ovn-controller: Change strategy for gateway conntrack zone allocation.
Commit 263064aeaa31e7 (Convert binding_run to incremental processing.)
changed the way patched_datapaths were handled. Previously we would
destroy the datastructure in every run and re-create it fresh. The new
way causes problems with the way conntrack zones are allocated as now
we can have stale port_binding entries causing segmentation faults.
With this commit, we simply don't depend on port_binding records in
conntrack zone allocation and instead store the UUID as a string in
the patch_datapath datastructure.
(The test enhanced with this commit would fail without the changes
in the commit, i.e. ovn-controller would crash.)
Signed-off-by: Gurucharan Shetty <guru@ovn.org> Acked-by: Ryan Moats <rmoats@us.ibm.com>
Patched datapaths that are no longer referenced should be removed from
the patched_datapaths map; otherwise incorrect state references for a
patched datapath may be used and also datapaths that are absent will be
interpreted as present.
This was already done a long time ago in
commit 64194c31a0b6 ("inet: Make tunnel RX/TX byte counters more consistent")
but tx path was broken (at least since 3.10).
Before the patch the gre header was included on tx.
After the patch:
$ ping -c1 192.168.0.121 ; ip -s l ls dev gre1
PING 192.168.0.121 (192.168.0.121) 56(84) bytes of data.
64 bytes from 192.168.0.121: icmp_req=1 ttl=64 time=2.95 ms
Reported-by: Julien Meunier <julien.meunier@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
ovs/vxlan: fix rtnl notifications on iface deletion
The function vxlan_dev_create() (only used by ovs) never calls
rtnl_configure_link(). The consequence is that dev->rtnl_link_state is
never set to RTNL_LINK_INITIALIZED.
During the deletion phase, the function rollback_registered_many() sends
a RTM_DELLINK only if dev->rtnl_link_state is set to RTNL_LINK_INITIALIZED.
Note that the function vxlan_dev_create() is moved after the rtnl stuff so
that vxlan_dellink() can be called in this function.
Fixes: dcc38c033b32 ("openvswitch: Re-add CONFIG_OPENVSWITCH_VXLAN") CC: Thomas Graf <tgraf@suug.ch> CC: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
ovs/geneve: fix rtnl notifications on iface deletion
The function geneve_dev_create_fb() (only used by ovs) never calls
rtnl_configure_link(). The consequence is that dev->rtnl_link_state is
never set to RTNL_LINK_INITIALIZED.
During the deletion phase, the function rollback_registered_many() sends
a RTM_DELLINK only if dev->rtnl_link_state is set to RTNL_LINK_INITIALIZED.
Fixes: e305ac6cf5a1 ("geneve: Add support to collect tunnel metadata.") CC: Pravin B Shelar <pshelar@ovn.org> CC: Jesse Gross <jesse@ovn.org> CC: Thomas Graf <tgraf@suug.ch> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
ovs/gre,geneve: fix error path when creating an iface
After ipgre_newlink()/geneve_configure() call, the netdev is registered.
Fixes: 7e059158d57b ("vxlan, gre, geneve: Set a large MTU on ovs-created tunnel devices") CC: David Wragg <david@weave.works> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
Tx errors represent the sum of the errors encountered while transmitting
packets.
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
udp: prevent skbs lingering in tunnel socket queues
In case we find a socket with encapsulation enabled we should call
the encap_recv function even if just a udp header without payload is
available. The callbacks are responsible for correctly verifying and
dropping the packets.
Also, in case the header validation fails for geneve and vxlan we
shouldn't put the skb back into the socket queue, no one will pick
them up there. Instead we can simply discard them in the respective
encap_recv functions.
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
udp_offload: Set encapsulation before inner completes.
UDP tunnel segmentation code relies on the inner offsets being set for
a UDP tunnel GSO packet, but the inner *_complete() functions will
set the inner offsets only if 'encapsulation' is set before calling
them. Currently, udp_gro_complete() sets 'encapsulation' only after
the inner *_complete() functions are done. This causes the inner
offsets to have invalid values after udp_gro_complete() returns, which
in turn makes it impossible to properly segment the packet in case
it needs to be forwarded. The user would see this either as
invalid packets being sent or as packet loss.
This patch fixes this by setting skb's 'encapsulation' in
udp_gro_complete() before calling into the inner complete functions,
and by making each possible UDP tunnel gro_complete() callback set the
inner_mac_header to the beginning of the tunnel payload.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Reviewed-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com> Reviewed-by: Jesse Gross <jesse@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
datapath: compat: get rid of OVS_CB inner header offsets.
OVS has GSO compat functionality which needs the inner offsets
of a packet in order to segment it. Older kernels did not
include these offsets in the skb, so they were stored
in OVS_GSO_CB. OVS has now dropped support for these
old kernels, so none of the supported kernels needs this
compat code. This patch removes it.
Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
openvswitch: correct encoding of set tunnel action attributes
In a set action tunnel attributes should be encoded in a
nested action.
I noticed this because ovs-dpctl was reporting an error
when dumping flows due to the incorrect encoding of tunnel attributes
in a set action.
Fixes: fc4099f17240 ("openvswitch: Fix egress tunnel info.") Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
In the case of UDP traffic with datagram length
below the MTU, this gives about a 2% performance increase
when tunneling over IPv4 and about 60% when tunneling
over IPv6.
Signed-off-by: Paolo Abeni <pabeni@redhat.com> Suggested-and-acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
The bug fix commit db3c6139e6e ("bpf, vxlan, geneve, gre: fix usage of
dst_cache on xmit") is also included. Geneve changes
were added in 468dfffcd762cbb2777ec5a76bc21e3748ebf47e ("geneve: add
dst caching support").
Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
openvswitch: netlink attributes for IPv6 tunneling
Add netlink attributes for IPv6 tunnel addresses. This enables IPv6 support
for tunnels.
Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
datapath: compat: Update Geneve and VxLAN modules.
This patch brings in various updates to the upstream Geneve and VXLAN
modules. For Geneve this patch adds IPv6 support; for VXLAN,
VXLAN-GPE is the major feature.
This should bring the OVS compat tunnel implementation in sync with the
current net branch.
Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
datapath: compat: Add support for IPv6 UDP tunnel segmentation.
The next patch adds support for IPv6 Geneve and VXLAN, but support for UDP
segmentation is not available on all supported kernels.
This patch adds support for UDP tunnels over IPv6 for such kernels.
Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
By moving stats update into iptunnel_xmit(), we can simplify
iptunnel_xmit() usage. With this change there is no need to
call another function (iptunnel_xmit_stats()) to update stats
in tunnel xmit code path.
Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
Add gro_receive and gro_complete to struct udp_tunnel_sock_cfg.
Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
ip_tunnel: add support for setting flow label via collect metadata
This patch extends udp_tunnel6_xmit_skb() to pass in the IPv6 flow label
from call sites. Currently, there's no such option and it's always set to
zero when writing ip6_flow_hdr(). Add a label member to ip_tunnel_key, so
that flow-based tunnels via collect metadata frontends can make use of it.
vxlan and geneve will be converted to add flow label support separately.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
datapath: backport: ip_tunnel_core: iptunnel_handle_offloads returns int and doesn't free skb
There is a return type change in the upstream handle-offload functions.
This patch brings these changes in.
This is a backport of aed069df ("ip_tunnel_core:
iptunnel_handle_offloads returns int and doesn't free skb").
I have also removed duplicate definitions of tunnel_handle_offloads()
from the ip-tunnel header.
Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
net: add dst_cache support
This patch adds a generic, lockless dst cache implementation.
The need for a lock is avoided by updating the dst cache fields
only in per-CPU scope, and by requiring that the cache manipulation
functions are invoked with the local bh disabled.
The refresh_ts and reset_ts fields are used to ensure cache
consistency in case of concurrent cache update (dst_cache_set*) and
reset operation (dst_cache_reset).
Consider the following scenario:
CPU1:                                       CPU2:
<cache lookup with empty cache: it fails>
<get dst via uncached route lookup>
                                            <related configuration changes>
                                            dst_cache_reset()
dst_cache_set()
The dst entry set passed to dst_cache_set() should not be used
for later dst cache lookup, because it's obtained using old
configuration values.
Since the refresh_ts is updated only on dst_cache lookup, the
cached value in the above scenario will be discarded on the next
lookup.
Signed-off-by: Paolo Abeni <pabeni@redhat.com> Suggested-and-acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
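The timestamp check described for the dst cache can be sketched with a toy single-CPU model. This is a minimal sketch under stated assumptions, not the kernel implementation: the class DstCacheSketch and the logical clock are hypothetical, refresh_ts is assumed to be stamped when a lookup fails, and an entry is assumed usable only if refresh_ts is later than reset_ts.

```python
import itertools

# Logical clock standing in for jiffies (assumption: strictly increasing ticks).
_clock = itertools.count(1)


class DstCacheSketch:
    """Single-CPU toy model; the real cache keeps one slot per CPU."""

    def __init__(self):
        self.dst = None
        self.refresh_ts = 0
        self.reset_ts = 0

    def get(self):
        now = next(_clock)
        # An entry is usable only if the slot was refreshed after the last reset.
        if self.dst is not None and self.refresh_ts > self.reset_ts:
            return self.dst
        self.dst = None
        self.refresh_ts = now    # stamped on a failed lookup only
        return None

    def set(self, dst):
        self.dst = dst           # deliberately leaves refresh_ts untouched

    def reset(self):
        self.reset_ts = next(_clock)


cache = DstCacheSketch()
initial = cache.get()             # CPU1: lookup on an empty cache fails
stale = "route-from-old-config"   # CPU1: uncached route lookup
cache.reset()                     # CPU2: configuration changed
cache.set(stale)                  # CPU1: stores the stale entry
first = cache.get()               # the stale entry is discarded here
cache.set("route-from-new-config")
second = cache.get()              # the fresh entry is returned
```

In this model the stale entry never reaches a caller, because its slot was last refreshed before the reset; the failed lookup restamps refresh_ts, so the entry stored afterwards is accepted.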
If a packet is either locally encapsulated or processed through GRO
it is marked with the offloads that it requires. However, when it is
decapsulated these tunnel offload indications are not removed. This
means that if we receive an encapsulated TCP packet, aggregate it with
GRO, decapsulate, and retransmit the resulting frame on a NIC that does
not support encapsulation, we won't be able to take advantage of hardware
offloads even though it is just a simple TCP packet at this point.
This fixes the problem by stripping off encapsulation offload indications
when packets are decapsulated.
The performance impacts of this bug are significant. In a test where a
Geneve encapsulated TCP stream is sent to a hypervisor, GRO'ed, decapsulated,
and bridged to a VM performance is improved by 60% (5Gbps->8Gbps) as a
result of avoiding unnecessary segmentation at the VM tap interface.
Reported-by: Ramu Ramamurthy <sramamur@linux.vnet.ibm.com> Fixes: 68c33163 ("v4 GRE: Add TCP segmentation offload for GRE") Signed-off-by: Jesse Gross <jesse@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
Part of skb_scrub_packet was open coded in iptunnel_pull_header. Let it call
skb_scrub_packet directly instead.
Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
Upstream, tunnel egress info is retrieved using ndo_fill_metadata_dst.
Since older kernels do not have it, we need to keep the vport operation
to do the same on these kernels.
This patch tries to merge these two operations into one to avoid code
duplication.
This commit backports fc4099f1 ("openvswitch:
Fix egress tunnel info.").
Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
The PMD thread needs to keep processing RX queues in order
to achieve maximum throughput. It also needs to sweep the emc
cache and quiesce, both of which use seq_mutex. That mutex can
eventually block the PMD thread, causing latency spikes and
affecting the throughput.
Since there is no requirement for running those tasks at a
specific time, this patch extends the seq API to allow tentative
locking instead.
Reported-by: Karl Rister <krister@redhat.com> Co-authored-by: Karl Rister <krister@redhat.com> Signed-off-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
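The idea of tentative locking for the PMD thread can be sketched as a try-lock around the housekeeping work. This is a minimal sketch, not the seq API: the function try_housekeeping is hypothetical, and the Python lock named seq_mutex only stands in for the mutex the commit refers to.

```python
import threading

# Stand-in for the mutex shared with other threads (assumption).
seq_mutex = threading.Lock()


def try_housekeeping(task):
    """Run task only if the lock is free; never block the polling loop."""
    if not seq_mutex.acquire(blocking=False):
        return False             # busy: skip now, retry on a later iteration
    try:
        task()
        return True
    finally:
        seq_mutex.release()


done = []
ran_when_free = try_housekeeping(lambda: done.append("sweep"))

seq_mutex.acquire()              # simulate another thread holding the lock
ran_when_busy = try_housekeeping(lambda: done.append("sweep"))
seq_mutex.release()
```

When the lock is contended the caller simply returns to packet processing and tries again later, which is acceptable because the deferred tasks have no deadline.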
netdev-dpdk: Obtain number of queues for vhost ports from attached virtio.
Currently, there are a few inconsistencies in the ways to configure the
number of queues for a netdev device:
* dpif-netdev can't know the exact number of queues
allocated inside netdev.
This leads to constant mapping of queue-ids to 'real' ones.
* We are able to configure 'n_rxq' for vhost-user devices, but
there is only one sane number of rx queues, which must be used
and configured manually (the number of queues allocated
in QEMU).
This patch disables configuration of 'n_rxq' for DPDK vHost devices.
Configuration of rx and tx queues is now applied automatically from the
connected virtio device. The standard reconfiguration mechanism is used to
apply these changes.
Also, 'n_txq' and 'n_rxq' are now always the real numbers of queues
in the device.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Ryan Moats [Thu, 7 Jul 2016 18:37:04 +0000 (13:37 -0500)]
ovn-controller: Remove old address set after change.
Currently, when an address set's value changes, ovn-controller
doesn't remove the old entry from the tracking hash; it
just adds the new one, leading to multiple entries for the
same symbol.
Fix this behavior and add a smoke test to avoid a regression
in the future.
Signed-off-by: Ryan Moats <rmoats@us.ibm.com> Acked-by: Flavio Fernandes <flavio@flaviof.com> Signed-off-by: Russell Bryant <russell@ovn.org>
netdev-linux: Do not log a warning if the device is down.
In the userspace datapath we use tap devices as internal netdev. The
datapath doesn't consider whether a device is up or down before sending
to it, and so far this hasn't been a problem.
Since Linux upstream commit 1bd4978a88ac("tun: honor IFF_UP in
tun_get_user()"), included in 4.4, writing to a tap device that is not
up sets errno to EIO. This commit avoids printing a warning in this
case.
This fixes failures in the system-userspace-testsuite.
Reported-by: Joe Stringer <joe@ovn.org> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ben Pfaff <blp@ovn.org>
Ilya Maximets [Mon, 27 Jun 2016 13:28:16 +0000 (16:28 +0300)]
netdev-dpdk: Use instant sending instead of queueing of packets.
The current implementation of TX packet queueing is broken in several ways:
* TX queue flushing implemented on receive assumes that all
core_id-s are sequential and start from zero. This may lead
to a situation where packets get stuck in the queue forever, and
it also affects latency.
* For a long time the flushing logic has depended on an uninitialized
'txq_needs_locking', because it is usually calculated after
'netdev_dpdk_alloc_txq' but used inside that function
for the initialization of 'flush_tx'.
Testing shows no performance difference with and without queueing.
Let's remove queueing altogether because it doesn't work properly now and
also does not increase performance.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Add INSTALL.DPDK-ADVANCED document that is forked off from original
INSTALL.DPDK guide. This document is targeted at users looking for
optimum performance on OVS using dpdk datapath.
Refactor INSTALL.DPDK into two documents named INSTALL.DPDK and
INSTALL.DPDK-ADVANCED. While the INSTALL.DPDK document helps the
novice user set up OVS DPDK and run it out of the box, the
ADVANCED document is targeted at expert users looking for the optimum
performance running the dpdk datapath.
Paul Boca [Wed, 6 Jul 2016 12:38:32 +0000 (12:38 +0000)]
vlog test: Disable default syslog logger
Disable the syslog logger because '/dev/log' doesn't exist on Windows.
It seems that on Python34 a default handler is added to the logger and it prints
even if no handler is set by us.
Signed-off-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
ofproto_port_open_type should be used for netdev_open, but not for other tests.
For example, STP/RSTP check for interfaces of internal type, but that check will
fail when the netdev datapath is used.
The same thing goes for setting MAC address of internal Interfaces. That fails
for the netdev datapath because the interface type is set to "tap", but they are
still interfaces of type "internal", just their netdev implementation is
different.
Use a netdev_type for the type that needs to be used for netdev_open and
ofproto_port, while we still keep the type as the normalized configured type in
the database.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com> Signed-off-by: Jesse Gross <jesse@kernel.org>
Andy Zhou [Fri, 17 Jun 2016 22:41:26 +0000 (15:41 -0700)]
lib: Remove extra API dependency for ovs_thread_create()
When calling ovs_thread_create() without calling fatal_signal_init()
first, ovs_thread_create() sometimes asserts. This dependency is
subtle and not very obvious.
The root cause seems to be that, within ovs_thread_create(), the
multi-threaded state is declared before all initializations are done.
Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Tue, 5 Jul 2016 15:33:05 +0000 (08:33 -0700)]
netlink-notifier: Avoid valgrind possible leak warning.
This ensures that pointers to nln_notifiers are to the beginning of the
structs instead of to the middle, meaning that valgrind does not consider
them "possible" leaks.
Reported-by: William Tu <u9012063@gmail.com> Tested-by: William Tu <u9012063@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Sun, 3 Jul 2016 04:16:55 +0000 (21:16 -0700)]
bridge: Add assertion to document an invariant in find_local_hw_addr().
Avoids a possible null pointer dereference report from Clang.
Reported-at: http://openvswitch.org/pipermail/dev/2016-June/073967.html Signed-off-by: Ben Pfaff <blp@ovn.org> Tested-by: William Tu <u9012063@gmail.com>
This commit adds schema changes to the OVN_Northbound database to support
Load balancers.
In ovn-northd, it adds two logical tables to program logical flows.
It adds a 'pre_lb' table that sits before 'pre_stateful' table.
For packets that need to be load balanced, this table sets reg0[0]
to act as a hint for the pre-stateful table to send the packet to
the conntrack table for defragmentation.
It also adds a 'lb' table that sits before 'stateful' table.
For packets from established connections, this table sets reg0[2] to
indicate to the 'stateful' table that the packet needs to be sent to
connection tracking table to just do NAT.
In stateful table, packet for a new connection that needs to be load balanced
is given a ct_lb($IP_LIST) action.
Signed-off-by: Gurucharan Shetty <guru@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
ovn-controller now supports 2 new logical actions.
1. ct_lb;
Sends the packet through the conntrack zone to NAT
packets. Packets that are part of established connection
will automatically get NATed based on the NAT arguments
supplied to conntrack when the first packet was committed.
2. ct_lb(192.168.1.2, 192.168.1.3);
ct_lb(192.168.1.2:80, 192.168.1.3:80);
Creates an OpenFlow group with multiple buckets and equal weights
that changes the destination IP address (and port number) of the packet
statefully to one of the options provided inside the parentheses.
Signed-off-by: Gurucharan Shetty <guru@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org>
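How the argument list of ct_lb maps to equal-weight buckets can be sketched as below. This is a minimal sketch under stated assumptions, not ovn-controller code: the function ct_lb_buckets, the dictionary layout, and the default weight of 100 are all hypothetical, and only IPv4 targets of the form ip or ip:port are handled.

```python
def ct_lb_buckets(targets, weight=100):
    """Turn 'ip[:port], ip[:port]' into equal-weight bucket descriptions.

    Hypothetical helper: IPv4 only, since a bare ':' split would
    misparse IPv6 addresses.
    """
    buckets = []
    for item in targets.split(","):
        ip, sep, port = item.strip().partition(":")
        buckets.append({
            "weight": weight,                    # same weight for every bucket
            "ip": ip,
            "port": int(port) if sep else None,  # None: leave the port unchanged
        })
    return buckets


with_ports = ct_lb_buckets("192.168.1.2:80, 192.168.1.3:80")
without_ports = ct_lb_buckets("192.168.1.2, 192.168.1.3")
```

Each bucket carries the same weight, so a select-style group would spread new connections evenly across the listed destinations.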
Ryan Moats [Sun, 3 Jul 2016 15:35:28 +0000 (10:35 -0500)]
Change tracking structures to use struct uuids
In encaps.c, binding.c, and lport.c incremental processing
is aided by tracking entries by their ovsdb row uuids.
The original patch sets used pointers, which might lead
to errors if the ovsdb row uuid memory is released. So,
use actual structures to hold the values instead.
Signed-off-by: Ryan Moats <rmoats@us.ibm.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Currently, the only use of stateful services in conntrack is
OVN ACLs. In table ACL, we commit the packet to conntrack
via ct_commit action.
As we introduce more stateful services, the ACL feature will
have to share the conntrack module with others. As
preparation for more stateful features like load balancing,
this commit introduces a new stateful table
that is responsible to commit packets to conntrack via
ct_commit action. If ACL table needs to commit a packet,
it sets 'reg0[1]' as 1. Stateful table in-turn will commit
the packet if 'reg0[1]' is 1.
Signed-off-by: Gurucharan Shetty <guru@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Currently, the only use of stateful services in conntrack is
OVN ACLs. In table pre-ACL, we send the packet to conntrack
to track it (to get its status) and to defrag via the ct_next
action.
As we introduce more stateful services, the ACL feature will
have to share the conntrack module with others. As
preparation for more stateful features like loadbalancing,
this commit introduces a new pre-stateful table that is
responsible to send packets through conntrack via
ct_next action. If pre-ACL table needs to send a packet
through conntrack, it just sets the 'reg0[0]' as 1.
Pre-stateful table in-turn will send the packet to conntrack
if 'reg0[0]' is 1.
Signed-off-by: Gurucharan Shetty <guru@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
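The register hand-off between the ACL tables and the stateful tables can be sketched as a small pipeline. This is a minimal sketch, not ovn-northd code: the packet dictionary, the function names, and the two boolean inputs are hypothetical. The bit assignments follow the commit messages in this log: reg0[0] requests ct_next in the pre-stateful table, and reg0[1] requests ct_commit in the stateful table.

```python
REG0_CT_NEXT = 1 << 0     # reg0[0]: pre-stateful should send the packet to conntrack
REG0_CT_COMMIT = 1 << 1   # reg0[1]: stateful should commit the connection


def pre_acl(pkt, needs_tracking):
    if needs_tracking:
        pkt["reg0"] |= REG0_CT_NEXT
    return pkt


def pre_stateful(pkt):
    if pkt["reg0"] & REG0_CT_NEXT:
        pkt["actions"].append("ct_next")
    return pkt


def acl(pkt, needs_commit):
    if needs_commit:
        pkt["reg0"] |= REG0_CT_COMMIT
    return pkt


def stateful(pkt):
    if pkt["reg0"] & REG0_CT_COMMIT:
        pkt["actions"].append("ct_commit")
    return pkt


def run(needs_tracking, needs_commit):
    """Walk one packet through the four tables in order."""
    pkt = {"reg0": 0, "actions": []}
    pkt = pre_acl(pkt, needs_tracking)
    pkt = pre_stateful(pkt)
    pkt = acl(pkt, needs_commit)
    return stateful(pkt)["actions"]
```

Because the feature tables only set hint bits, another stateful feature could set the same bits without adding its own conntrack action.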
This example match only has 3 addresses, but it could easily have
hundreds of addresses. In some cases, the same large set of addresses
needs to be used in several ACLs.
This patch adds a new Address_Set table to OVN_Northbound so that a set
of addresses can be specified once and then referred to by name in ACLs.
To recreate the above example, you would first create an address set:
ovn-controller: process lport bindings only when transaction is possible
As currently implemented, binding_run() normally updates the set of
locally owned logical ports on each call. When changes to the
membership of this set are detected (i.e. when locally bound
logical ports are added or deleted), additional processing to
update the sb database with lport binding is performed.
However, the sb database can only be updated when a transaction to
the sb database is possible (that is, when ctx->ovnsb_idl_txn is
non-NULL). If a new logical port is detected while ctx->ovnsb_idl_txn
happens to be NULL, its binding information will not be updated in
the sb database until another change to the set of locally-owned
logical ports occurs. If no such change ever occurs, the sb database
is never updated with the appropriate binding information.
Eliminate this issue by only updating the set of locally owned logical
ports when an sb database transaction is possible. This addresses
a cause of occasional failures in the "3 HVs, 3 LS, 3 lports/LS, 1 LR"
test case.
The failing scenario goes like this:
1) Test case logical network setup is complete.
2) The last physical network port is added via
as hv3 ovs-vsctl --add-port ... --set Interface vif333 external-ids:iface-id=lp333
3) hv3 ovn-controller receives update from hv3 ovsdb-server with above mapping,
binding_run() is called, and ctx->ovnsb_idl_txn happens to be NULL.
4) binding_run() calls get_local_iface_ids(), which recognizes the new
local port as matching a logical port, so the lp333 is added to the
global ssets "lports" and "all_lports". This means lp333 will not be treated
as a new logical port on subsequent calls. Because get_local_iface_ids()
has discovered a new lport, it returns changed = true.
5) Because get_local_iface_ids() returned true, binding_run() sets process_full_binding
to true.
6) Because process_full_binding is true, binding_run() calls consider_local_datapath()
for each logical port in shash_lports (which now includes lp333).
7) consider_local_datapath() processing returns without calling
sbrec_port_binding_set_chassis() because ctx->ovnsb_idl_txn is NULL.
8) There are subsequent calls to binding_run() with non-NULL ctx->ovnsb_idl_txn,
but because lp333 is already in the "lports" sset, get_local_iface_ids()
returns changed=false, so process_full_binding is false, which means
consider_local_datapath() is not called for lp333.
9) Because consider_local_datapath() is not called for lp333, the sb database
is not updated with the lport/chassis binding.
Hopefully the above is intelligible. Another way of looking at it would be
to say the condition for calling consider_local_datapath() is an "edge trigger";
this change suppresses the trigger until the necessary actions can be performed.
Signed-off-by: Lance Richardson <lrichard@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
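The failing scenario and the fix can be sketched with a toy model of the edge trigger. This is a minimal sketch, not ovn-controller code: the class BindingSketch, its fields, and the only_update_with_txn switch are hypothetical, and the bound set stands in for the effect of sbrec_port_binding_set_chassis().

```python
class BindingSketch:
    """Toy model of binding_run(); names and structure are assumptions."""

    def __init__(self, only_update_with_txn):
        self.only_update_with_txn = only_update_with_txn
        self.lports = set()      # locally owned logical ports seen so far
        self.bound = set()       # ports whose binding reached the sb database

    def run(self, local_ifaces, txn_available):
        if self.only_update_with_txn and not txn_available:
            return               # fixed behaviour: leave the set untouched
        new = set(local_ifaces) - self.lports
        self.lports |= new       # the "edge": a port counts as new only once
        if new and txn_available:
            self.bound |= self.lports


def simulate(only_update_with_txn):
    b = BindingSketch(only_update_with_txn)
    b.run(["lp333"], txn_available=False)   # step 3: no transaction possible
    b.run(["lp333"], txn_available=True)    # step 8: later run with a transaction
    return b.bound


before_fix = simulate(only_update_with_txn=False)
after_fix = simulate(only_update_with_txn=True)
```

Without the guard, the first run consumes the edge while no transaction exists, so the later run sees nothing new and lp333 is never bound. With the guard, the edge is preserved until it can be acted on.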
If there are multiple logical switches or routers with a duplicate name,
the configuration is slightly different. You should configure the logical
switches or routers using the UUID instead of the name.
Signed-off-by: nickcooper-zhangtonghao <nickcooper-zhangtonghao@opencloud.tech> Signed-off-by: Ben Pfaff <blp@ovn.org>
When new tables are introduced, it gets a little harder to
track all the different table numbers used in the documentation.
This commit changes some table numbers to names to make it a little
easier to update documentation when new tables are introduced in the
upcoming commits.
Signed-off-by: Gurucharan Shetty <guru@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Future patches introduce more tables between
pre-ACL and ACL processing. As such, it looks
easier to separate these out into separate
functions to enhance code readability.
Signed-off-by: Gurucharan Shetty <guru@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
William Tu [Wed, 29 Jun 2016 21:38:02 +0000 (14:38 -0700)]
ofproto-dpif-mirror: Add mirror snaplen support.
This patch adds a 'snaplen' config for the mirroring table. A mirrored packet
with size larger than snaplen bytes will be truncated in the datapath before
being sent to the mirror output port.
Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/141186839 Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
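The effect of 'snaplen' on a mirrored packet can be sketched in a few lines. This is a minimal sketch, not datapath code: the function mirror_copy is hypothetical, and treating snaplen=None as "no truncation" is an assumption made for the example.

```python
def mirror_copy(packet, snaplen=None):
    """Return the bytes that would be sent to the mirror output port."""
    if snaplen is not None and len(packet) > snaplen:
        return packet[:snaplen]   # keep only the first snaplen bytes
    return packet                 # short packets pass through unchanged


large = mirror_copy(b"x" * 100, snaplen=64)
small = mirror_copy(b"x" * 20, snaplen=64)
```

Only the copy sent to the mirror port is shortened in this model; the packet forwarded to its normal destination is not touched.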
William Tu [Wed, 29 Jun 2016 17:35:00 +0000 (10:35 -0700)]
vagrant: Add FreeBSD 10.2 box support.
Add FreeBSD 10.2 vagrant file "Vagrantfile-FreeBSD". Users can run
'VAGRANT_VAGRANTFILE=Vagrantfile-FreeBSD vagrant up' to test basic
OVS configure, build, and check.
Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
William Tu [Wed, 29 Jun 2016 05:02:26 +0000 (22:02 -0700)]
ovn-nbctl: Fix double free in nbctl_lr_route_list().
The intent here was to free the error reported by ipv6_parse_cidr(),
but in fact the error reported by that function was discarded and
the previous error from ip_parse_cidr() was freed again.
Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Jan Scheurich [Tue, 28 Jun 2016 22:29:25 +0000 (00:29 +0200)]
ofproto: Add relaxed group_mod command ADD_OR_MOD
This patch adds support for a new Group Mod command OFPGC_ADD_OR_MOD to
OVS for all OpenFlow versions that support groups (OF11 and higher).
The new ADD_OR_MOD creates a group that does not yet exist (like ADD)
and modifies an existing group (like MODIFY).
Rationale: In OpenFlow 1.x the Group Mod commands OFPGC_ADD and
OFPGC_MODIFY have strict semantics: ADD fails if the group exists,
while MODIFY fails if the group does not exist. This requires a
controller to know the exact state of the switch when programming a
group in order not to run the risk of getting an OFP Error message in
response. This is hard to achieve and maintain at all times in view of
possible switch and controller restarts or other connection losses
between switch and controller.
Due to the un-acknowledged nature of the Group Mod message, programming
groups safely and efficiently at the same time is virtually impossible,
as the controller has to either query the existence of the group prior
to each Group Mod message or insert a Barrier Request/Reply after
every group to be sure that no Error can be received at a later stage
and require a complicated roll-back of any dependent actions taken
between the failed Group Mod and the Error.
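The difference between the strict commands and ADD_OR_MOD can be sketched with a toy group table. This is a minimal sketch of the semantics described above, not ofproto code: the classes GroupTable and GroupError and the bucket strings are hypothetical.

```python
class GroupError(Exception):
    """Stands in for the OFP Error message a switch would send."""


class GroupTable:
    def __init__(self):
        self.groups = {}

    def add(self, group_id, buckets):
        # Strict ADD: fails if the group already exists.
        if group_id in self.groups:
            raise GroupError("group already exists")
        self.groups[group_id] = buckets

    def modify(self, group_id, buckets):
        # Strict MODIFY: fails if the group does not exist.
        if group_id not in self.groups:
            raise GroupError("unknown group")
        self.groups[group_id] = buckets

    def add_or_mod(self, group_id, buckets):
        # Relaxed command: create or replace, whatever the current state.
        self.groups[group_id] = buckets


table = GroupTable()
table.add_or_mod(1, ["bucket-a"])             # behaves like ADD
table.add_or_mod(1, ["bucket-a", "bucket-b"]) # behaves like MODIFY

try:
    table.add(1, ["bucket-c"])
    strict_add_failed = False
except GroupError:
    strict_add_failed = True
```

With the relaxed command the controller can send the same message after a restart or reconnect without first learning whether the group exists.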
In the ovs-ofctl command line the ADD_OR_MOD command is made available
through the new option --may-create in the mod-group command: