Russell Bryant [Thu, 27 Jul 2017 00:30:34 +0000 (20:30 -0400)]
Documentation/conf.py: Fix line length.
A previous commit introduced a line that was greater than 79
characters long, causing a flake8 warning to be emitted.
Reported-by: Joe Stringer <joe@ovn.org> Fixes: 5ca89127382d ("docs: Refer to correct package name for sphinx theme.") Signed-off-by: Russell Bryant <russell@ovn.org>
openvswitch: fix potential out of bound access in parse_ct
Before the 'type' is validated, we shouldn't use it to fetch the
ovs_ct_attr_lens's minlen and maxlen, else, out of bound access
may happen.
Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action") Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Pick up an upstream bug fix.
Fixes: a94ebc39996b ("datapath: Add conntrack action") Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Joe Stringer <joe@ovn.org>
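The ordering problem fixed above can be illustrated with a minimal sketch.
It is written in Python rather than kernel C, and the table contents and
the names ATTR_LENS, CT_ATTR_MAX and parse_attr are hypothetical; they are
not the real ovs_ct_attr_lens values. The point is only that the type is
range-checked before it is used as a table index.

```python
# Hypothetical per-type (minlen, maxlen) table, indexed by attribute type.
CT_ATTR_MAX = 3
ATTR_LENS = [(0, 0), (0, 0), (2, 2), (4, 16)]  # valid indexes: 0..CT_ATTR_MAX


def parse_attr(attr_type, payload):
    """Validate 'attr_type' first, and only then use it as a table index."""
    if attr_type < 0 or attr_type > CT_ATTR_MAX:
        raise ValueError("unknown attribute type %d" % attr_type)
    # Safe: 'attr_type' is now known to be inside the table.
    minlen, maxlen = ATTR_LENS[attr_type]
    if not minlen <= len(payload) <= maxlen:
        raise ValueError("bad length %d for type %d"
                         % (len(payload), attr_type))
    return attr_type, payload
```

Doing the table lookup before the range check is the pattern the fix
removes.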
Joe Stringer [Wed, 26 Jul 2017 19:49:48 +0000 (12:49 -0700)]
system-userspace-macros: Fix ethtool with new kernels.
The latest net-next kernels have removed the UFO feature, which results
in older ethtool reporting the following error:
Cannot get device udp-fragmentation-offload settings: Operation not
supported
Currently, we rely on no errors being reported, and if there is an error
then a failure is reported. However, in this case we can safely ignore
the stderr output. We still check the return code so if something is
truly fatal, a failure will still be reported; otherwise, we will not
fail the test due to the above.
Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
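The approach described above (ignore stderr, still honour the return code)
can be sketched as follows. This is an illustration in Python, not the
autotest macro that the commit changes; the helper name
run_ignoring_stderr is hypothetical.

```python
import subprocess
import sys


def run_ignoring_stderr(argv):
    """Run 'argv', drop anything written to stderr, return True on exit 0."""
    result = subprocess.run(argv, stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0


if __name__ == "__main__":
    # A child that complains on stderr but exits 0 still counts as success.
    noisy = [sys.executable, "-c", "import sys; sys.stderr.write('warning')"]
    print(run_ignoring_stderr(noisy))
```

A call such as run_ignoring_stderr(["ethtool", "-K", "p0", "tx", "off"])
(device name 'p0' is made up) would then fail only when ethtool itself
returns a non-zero status.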
A localnet port is not an endpoint, and at present there are no security
requirements that apply to it. So, for performance reasons, we can skip
ct for localnet ports.
A more detailed discussion can be found at
https://mail.openvswitch.org/pipermail/ovs-dev/2017-July/335048.html
Signed-off-by: wangqianyu <wang.qianyu@zte.com.cn> Acked-by: Han Zhou <zhouhan@gmail.com> Signed-off-by: Russell Bryant <russell@ovn.org>
Currently, checking more than one patch or file requires invoking the
script separately for each file.
Fix that by iterating over all the passed filenames.
Note: If the '-f' option is passed, all the files are treated as regular
files. Without '-f', all the files are treated as patch files.
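The iteration described above might look roughly like the sketch below.
The names check_all and check_one are hypothetical and do not come from
the script itself; 'check_one' stands in for whatever per-file check the
script performs.

```python
def check_all(filenames, check_one, treat_as_file=False):
    """Run 'check_one' on every name and return the number of failures.

    'check_one' is a hypothetical callable taking (filename, treat_as_file)
    and returning True when the file passes. 'treat_as_file' models the
    '-f' option: it applies to all names, not per name.
    """
    failures = 0
    for name in filenames:
        if not check_one(name, treat_as_file):
            failures += 1
    return failures
```

Every name is checked even after a failure, so one run reports problems
in all of the given files.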
Avoid null pointer dereference in fdb_calculate_active_tunnels()
when integration bridge isn't present. This is easily encountered
by executing "make sandbox SANDBOXFLAGS=--ovn".
Fixes: 3475695ea61c ("ovn: l3ha, enable bfd between tunnel endpoints") Signed-off-by: Lance Richardson <lrichard@redhat.com> Signed-off-by: Russell Bryant <russell@ovn.org>
Andy Zhou [Tue, 25 Jul 2017 18:28:37 +0000 (11:28 -0700)]
bond: Adjust bond hash masks
Commit 42781e77035d (bond: Unify hash functions in hash action and entry
lookup.) changed the BM_TCP's hash function, but did not update
hash mask fields accordingly. Found by inspection.
netdev-dummy: Fix setting length in receive command.
Currently, if the '--len' option is passed to the 'netdev-dummy/receive'
command, only the 'size' field of the dp_packet changes.
This is incorrect behaviour, because memory for that size is not
allocated and the packet headers are not fixed up to reflect the new size.
This leads to flow_extract() failure, because it checks
'ip->tot_len' and stops further parsing if it doesn't match
dp_packet_size(). As a result, packets created while processing the
'receive' command can't be parsed back to the same flow.
Additionally, this may lead to invalid memory accesses if someone
tries to read or modify the packet data.
Fix that by creating correct packets using the recently introduced
'flow_compose_size()'.
CC: Andy Zhou <azhou@ovn.org> Fixes: d8ada2368cbe ("netdev-dummy: Add --len option for netdev-dummy/receive command") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Andy Zhou <azhou@ovn.org>
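To illustrate why the size and the headers must agree, here is a small
sketch that builds an Ethernet + IPv4 frame of a requested size and keeps
the IPv4 total length consistent with it. It is not flow_compose_size();
the function names, the zeroed addresses and the zero IP checksum are
simplifications made for this example.

```python
import struct

ETH_HLEN = 14   # Ethernet header length in bytes.
IP_HLEN = 20    # IPv4 header length without options.


def compose_ipv4_frame(size):
    """Build a frame of exactly 'size' bytes with a matching IPv4 tot_len."""
    if not ETH_HLEN + IP_HLEN <= size <= ETH_HLEN + 0xFFFF:
        raise ValueError("unsupported frame size %d" % size)
    eth = bytes(12) + struct.pack("!H", 0x0800)       # dst, src, ethertype
    tot_len = size - ETH_HLEN                         # IP header + payload
    ip = struct.pack("!BBHHHBBH4s4s",
                     0x45, 0, tot_len, 0, 0, 64, 17, 0, bytes(4), bytes(4))
    return eth + ip + bytes(size - ETH_HLEN - IP_HLEN)


def ip_len_matches(frame):
    """Mimic the kind of check a parser does: tot_len vs. actual size."""
    (tot_len,) = struct.unpack_from("!H", frame, ETH_HLEN + 2)
    return tot_len == len(frame) - ETH_HLEN
```

Changing only the recorded size, without rebuilding the headers, is the
situation in which ip_len_matches() would return False.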
This allows composing packets with different real lengths from
odp flows, i.e. memory will be allocated for the requested packet
size and all required headers like ip->tot_len will be filled in correctly.
It will be used in netdev-dummy to properly handle the '--len' option.
Suggested-by: Andy Zhou <azhou@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Andy Zhou <azhou@ovn.org>
Mark Michelson [Fri, 21 Jul 2017 20:46:00 +0000 (15:46 -0500)]
stream-ssl: Fix memory leak in error scenario
ssl_new_stream() takes ownership of the passed-in 'name' parameter.
In error scenarios, the name is leaked. I was able to trigger this
leak by attempting to connect to an ovsdb over SSL and specifying
non-existent certificate, private key, and CA cert files.
This patch fixes the problem by freeing 'name' in the error label.
Signed-off-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Russell Bryant <russell@ovn.org>
Since the introduction of the 'hash_mac()' function in
commit 7e36ac42e33a ("lib/packet.h: add hash_mac()"), there is no
need for an additional wrapper for MAC address hashing.
Let's use 'hash_mac()' directly and remove 'bond_hash_src()' to
simplify the code.
Suggested-by: Andy Zhou <azhou@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Andy Zhou <azhou@ovn.org>
bond: Unify hash functions in hash action and entry lookup.
'lookup_bond_entry' currently uses 'flow_hash_symmetric_l4' while
OVS_ACTION_ATTR_HASH uses 'flow_hash_5tuple'. This may lead to
inconsistency in choosing the slave for new flows. Strictly speaking,
the hash functions do not need to be unified, because that is not
required for correct operation, but it's logically wrong to use
different hash functions there.
Unfortunately we're not able to use the RSS hash here, because we have
no packet at this point, but we may reduce the inconsistency by using
'flow_hash_5tuple' instead of 'flow_hash_symmetric_l4', because the
symmetric property is not needed.
'flow_hash_symmetric_l4' was used previously only because no other
hash function was implemented at the time and L2 fields were
additionally involved in the hash calculation. Now we have the 5-tuple
hash and L2 is no longer used, so we may replace the old
function.
'flow_hash_5tuple' is the preferable solution because it is 2 to 8 times
(depending on the flow) faster than the symmetric function.
So, this change will also speed up handling of new flows and
statistics accounting.
Additionally, the function 'bond_hash_tcp()' was removed to simplify
the code and possibly gain some additional speed.
Co-authored-by: Andy Zhou <azhou@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Andy Zhou <azhou@ovn.org>
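The difference between a symmetric hash and a plain 5-tuple hash can be
shown with a toy example. CRC32 is used here purely as a stand-in; the
real 'flow_hash_symmetric_l4' and 'flow_hash_5tuple' use different
algorithms, and the function names below are hypothetical.

```python
import zlib


def _crc(fields):
    # Stand-in hash; OVS uses its own hash functions, not CRC32.
    return zlib.crc32(repr(fields).encode())


def hash_5tuple(src_ip, dst_ip, proto, src_port, dst_port):
    """Direction-sensitive hash over the 5-tuple, fields in packet order."""
    return _crc((src_ip, dst_ip, proto, src_port, dst_port))


def hash_symmetric(src_ip, dst_ip, proto, src_port, dst_port):
    """Direction-insensitive hash: order the two endpoints before hashing."""
    first, second = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    return _crc((first, second, proto))
```

With hash_symmetric() both directions of a connection always map to the
same value, a property that bond slave selection does not need according
to the message above.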
vswitch.xml: Fix L2 balancing mentioning for balance-tcp bond.
L2 fields are not used in userspace hash action since
commit 4f150744921f ("dpif-netdev: Use miniflow as a flow key.").
In the kernel datapath, RSS (which does not include L2 by default for
most NICs) was used from the beginning. This means that
if recirculation is in use, L2 fields are not used for flow
balancing.
Fix the documentation accordingly.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Andy Zhou <azhou@ovn.org>
Russell Bryant [Mon, 24 Jul 2017 20:52:30 +0000 (16:52 -0400)]
ovn-architecture: Remove outdated comment.
This outdated comment said that support for hardware gateways that
support the vtep schema would come later. This was actually
implemented a long time ago.
Signed-off-by: Russell Bryant <russell@ovn.org> Acked-by: Miguel Angel Ajo <majopela@redhat.com>
When there is an established connection in direction A->B, it is
possible to receive a packet on port B which then executes
ct(commit,force) without first performing ct(), i.e. a lookup.
In this case, we would expect that this packet can delete the
existing entry so that we can commit a connection with direction B->A.
However, currently we only perform a check in skb_nfct_cached() for
whether OVS_CS_F_TRACKED is set and OVS_CS_F_INVALID is not set, i.e.
that a lookup previously occurred. In the above scenario, a lookup
has not occurred but we should still be able to statelessly look
up the existing entry and potentially delete the entry if it is
in the opposite direction.
This patch extends the check to also hint that if the action has the
force flag set, then we will lookup the existing entry so that the
force check at the end of skb_nfct_cached has the ability to delete
the connection.
Fixes: dd41d330b03 ("openvswitch: Add force commit.") CC: Pravin Shelar <pshelar@nicira.com> CC: dev@openvswitch.org Signed-off-by: Joe Stringer <joe@ovn.org> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Co-authored-by: Joe Stringer <joe@ovn.org> Signed-off-by: Joe Stringer <joe@ovn.org> Signed-off-by: Greg Rose <gvrose8192@gmail.com>
openvswitch: fix mis-ordered comment lines for ovs_skb_cb
I was trying to wrap my head around meaning of mru, and realised
that the second line of the comment defining it had somehow
ended up after the line defining cutlen, leading to much confusion.
Reorder the lines to make sense.
Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Joe Stringer <joe@ovn.org>
When compiling OvS-master on 4.4.0-81 kernel,
there is a warning:
CC [M] /root/ovs/datapath/linux/datapath.o
/root/ovs/datapath/linux/datapath.c: In function
'ovs_flow_cmd_set':
/root/ovs/datapath/linux/datapath.c:1221:1: warning:
the frame size of 1040 bytes is larger than 1024 bytes
[-Wframe-larger-than=]
This patch factors out match-init and action-copy to avoid the
"-Wframe-larger-than=1024" warning. Because the mask is only
used to get the actions, we add a new function to save some
stack space.
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: Joe Stringer <joe@ovn.org>
Switches and modern SR-IOV enabled NICs may multiplex traffic from port
representators and control messages over a single set of hardware queues.
Control messages and muxed traffic may need ordered delivery.
Those requirements make it hard to comfortably use TC infrastructure today
unless we have a way of attaching metadata to skbs at the upper device.
Because a single set of queues is used for many netdevs, reliably stopping
the TC/sched queues of all of them is impossible, and the lower device has to
retreat to returning NETDEV_TX_BUSY and usually has to take extra locks on
the fastpath.
This patch attempts to enable port/representative devs to attach metadata
to skbs which carry port id. This way representatives can be queueless and
all queuing can be performed at the lower netdev in the usual way.
Traffic arriving on the port/representative interfaces will have
metadata attached and will subsequently be queued to the lower device for
transmission. The lower device should recognize the metadata and translate
it to HW specific format which is most likely either a special header
inserted before the network headers or descriptor/metadata fields.
Metadata is associated with the lower device by storing the netdev pointer
along with port id so that if TC decides to redirect or mirror the new
netdev will not try to interpret it.
This is mostly for SR-IOV devices since switches don't have lower netdevs
today.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: 3fcece12bc1b ("net: store port/representator id in metadata_dst") Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Greg Rose <gvrose8192@gmail.com>
There is no good reason to keep the flags twice in vxlan_dev and
vxlan_config.
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Applied using HAVE_VXLAN_DEV_CFG compatibility flag defined in
acinclude.m4.
Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Joe Stringer <joe@ovn.org>
Greg Rose [Fri, 21 Jul 2017 23:46:08 +0000 (16:46 -0700)]
compat: Implement upstream net device free change.
Upstream commit cf124db566e6 ("net: Fix inconsistent teardown and
release of private netdev state.") removed the destructor member
of the net_device structure and replaced it with a boolean flag
indicating that the net device resource needs freeing. Use
compat flag HAVE_NEEDS_FREE_NETDEV to indicate whether the new
flag should be used.
Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Joe Stringer <joe@ovn.org>
Apply it to the tree (with one manual fixup to keep the
comment in vxlan.c, which spatch removed.)
Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Use e45a79da863c ("skbuff/mac80211: introduce and use skb_put_zero()")
as the basis for the backported function.
Upstream: de77b966ce8a ("networking: convert many more places to skb_put_zero()") Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Greg Rose <gvrose8192@gmail.com>
net: Fix inconsistent teardown and release of private netdev state.
Network devices can allocate resources and private memory using
netdev_ops->ndo_init(). However, the release of these resources
can occur in one of two different places.
Either netdev_ops->ndo_uninit() or netdev->destructor().
The decision of which operation frees the resources depends upon
whether it is necessary for all netdev refs to be released before it
is safe to perform the freeing.
netdev_ops->ndo_uninit() presumably can occur right after the
NETDEV_UNREGISTER notifier completes and the unicast and multicast
address lists are flushed.
netdev->destructor(), on the other hand, does not run until the
netdev references all go away.
Further complicating the situation is that netdev->destructor()
almost universally also does a free_netdev().
This creates a problem for the logic in register_netdevice(),
because all callers of register_netdevice() manage the freeing
of the netdev, and invoke free_netdev(dev) if register_netdevice()
fails.
If netdev_ops->ndo_init() succeeds, but something else fails inside
of register_netdevice(), it does call ndo_ops->ndo_uninit(). But
it is not able to invoke netdev->destructor().
This is because netdev->destructor() will do a free_netdev() and
then the caller of register_netdevice() will do the same.
However, this means that the resources that would normally be released
by netdev->destructor() will not be.
Over the years drivers have added local hacks to deal with this, by
invoking their destructor parts by hand when register_netdevice()
fails.
Many drivers do not try to deal with this, and instead we have leaks.
Let's close this hole by formalizing the distinction between what
private things need to be freed up by netdev->destructor() and whether
the driver needs unregister_netdevice() to perform the free_netdev().
netdev->priv_destructor() performs all actions to free up the private
resources that used to be freed by netdev->destructor(), except for
free_netdev().
netdev->needs_free_netdev is a boolean that indicates whether
free_netdev() should be done at the end of unregister_netdevice().
Now, register_netdevice() can sanely release all resources after
ndo_ops->ndo_init() succeeds, by invoking both ndo_ops->ndo_uninit()
and netdev->priv_destructor().
And at the end of unregister_netdevice(), we invoke
netdev->priv_destructor() and optionally call free_netdev().
Signed-off-by: David S. Miller <davem@davemloft.net>
Applied the portion of the commit applicable to openvswitch.
Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Joe Stringer <joe@ovn.org>
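The resulting teardown rules can be modelled with a toy sketch. This is a
Python model of the behaviour described above, not kernel code: the class,
the 'fail_after_init' switch and the 'log' list exist only for
illustration, while the method names mirror the ones used in the text.

```python
class NetDevice:
    """Toy model of a net device; records the order of teardown calls."""

    def __init__(self, fail_after_init=False, needs_free_netdev=False):
        self.fail_after_init = fail_after_init      # simulate a late failure
        self.needs_free_netdev = needs_free_netdev
        self.log = []

    def ndo_init(self):
        self.log.append("ndo_init")

    def ndo_uninit(self):
        self.log.append("ndo_uninit")

    def priv_destructor(self):
        # Frees private resources only; never frees the device itself.
        self.log.append("priv_destructor")


def free_netdev(dev):
    dev.log.append("free_netdev")


def register_netdevice(dev):
    """Return True on success; on failure, undo ndo_init() completely."""
    dev.ndo_init()
    if dev.fail_after_init:
        dev.ndo_uninit()
        dev.priv_destructor()
        return False    # The caller still owns 'dev' and frees it.
    return True


def unregister_netdevice(dev):
    dev.ndo_uninit()
    dev.priv_destructor()
    if dev.needs_free_netdev:
        free_netdev(dev)
```

In both paths the private resources are released exactly once and
free_netdev() is never called twice, which is the property the old
destructor-based scheme could not guarantee.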
openvswitch: more accurate checksumming in queue_userspace_packet()
If the skb carries an SCTP packet and ip_summed is CHECKSUM_PARTIAL, it needs
CRC32c in place of the Internet checksum: use skb_csum_hwoffload_help to avoid
corrupting such packets while queueing them towards userspace.
Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Joe Stringer <joe@ovn.org>
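The two checksum algorithms mentioned above are unrelated, which is why
applying the wrong one corrupts the packet. The sketch below implements
both in plain Python for comparison; it says nothing about how
skb_csum_hwoffload_help is implemented, and the function names are chosen
for this example only.

```python
def internet_checksum(data):
    """16-bit one's-complement checksum, as used by IP, TCP and UDP."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole 16-bit word
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def crc32c(data):
    """Bitwise CRC32c (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF
```

The bitwise CRC loop is slow but easy to check against the standard test
string "123456789".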
netfilter: introduce nf_conntrack_helper_put helper function
Convert the module_put invocation to nf_conntrack_helper_put. This
prepares for the follow-up patch, which will add a refcnt for cthelper
so that a delete request can be rejected while the cthelper is in use.
Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Applied with additional use of HAVE_NF_CONNTRACK_HELPER_PUT compatibility
flag defined in acinclude.m4.
Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Joe Stringer <joe@ovn.org>
tests: Check whether the ip command supports udp6zerocsum.
The installed version of the ip command may not support udp6zerocsum for
vxlan6 or geneve6. In that case, running the kernel check always
produces an error message. Check the ip command before running
the tests.
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: Eric Garver <e@erig.me> Signed-off-by: Joe Stringer <joe@ovn.org>
System-tests: Improve reliability of an icmp test.
One SNAT test is based on a single ping being successful;
to make the result more predictable, static arp binding is now used.
To put less stress on the stack a single arp binding is used for
the reverse direction mapping. This does not change the goal of the
test, but significantly increases the reliability; I ran the test
100 times without failure.
Signed-off-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Joe Stringer <joe@ovn.org>
Three of the SNAT tests allow for wget retries, which occasionally
happen. However, these tests did not allow for SNAT address
variability for the retries, which is now tolerated.
Signed-off-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Joe Stringer <joe@ovn.org>
Russell Bryant [Sun, 16 Jul 2017 19:39:56 +0000 (15:39 -0400)]
docs: Note currently used L3 gateway HA approach.
The OVN gateway HA design document is very useful in its current form.
It describes a range of options OVN could take to provide gateway HA.
Leave all the useful discussion in place and add a note to indicate
how the current implementation lines up with the options described.
I plan to follow up with an additional patch to describe the current L3
gateway HA implementation in the ovn-architecture document.
Signed-off-by: Russell Bryant <russell@ovn.org> Acked-by: Miguel Angel Ajo <majopela@redhat.com>
OVS and the kernel stack would add frag_queue entries to the same
netns_frags list. As a result, OVS and the kernel may access a
frag_queue without the correct lock. Also, struct ipq may be different
on kernels older than 4.3, which leads to invalid pointer access.
The fix creates a dedicated netns_frags for OVS.
Signed-off-by: wangzhike <wangzhike@jd.com> Signed-off-by: Joe Stringer <joe@ovn.org>
rhel/systemd: Set ovs-vswitchd timeout to 5 minutes
During initialization, it's possible that the startup time takes longer
than the systemd default provided. Set this to be 5 minutes. If we
take longer than 5 minutes, maybe something is wrong.
As an example of long initialization, enable DPDK, and allocate large
numbers of hugepages before starting ovs-vswitchd. The vswitchd can
take two or more minutes to start. During that time, systemd will decide
that the startup time took too long, and kill the parent process, leading
eventually to an error like:
ovs|00011|daemon_unix|EMER|pipe write failed (Broken pipe)
And a systemd log like:
ovs-vswitchd.service start operation timed out. Terminating.
The 5 minutes setting has been observed to work on a system where 400G
of hugepages were allocated.
Russell Bryant [Sun, 16 Jul 2017 20:07:12 +0000 (16:07 -0400)]
ovn-architecture: Add notes on L3 gateway HA.
Add some comments to the ovn-architecture document that distributed
gateway ports can also be made highly available. Provide a brief
overview of the approach and point to the gateway HA design document
for a more detailed discussion of the approach taken.
Signed-off-by: Russell Bryant <russell@ovn.org> Acked-by: Miguel Angel Ajo <majopela@redhat.com>
odp-execute: Reuse rss hash in OVS_ACTION_ATTR_HASH.
If an RSS hash exists in a packet, it can be reused instead of
recalculating the 5-tuple hash in OVS_ACTION_ATTR_HASH. This
improves the performance of sending packets to
OVS bonding in the userspace datapath by up to 10-15%.
Additionally fixed unit test 'select group with dp_hash
selection method' to not depend on dp_hash value.
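The reuse logic can be summarized with a short sketch. The packet is
modelled as a plain dict and 'compute_5tuple_hash' is a hypothetical
callable; neither reflects the actual dp_packet structure or the odp-execute
code.

```python
def packet_hash(packet, compute_5tuple_hash):
    """Return the packet's RSS hash if present, else compute a 5-tuple hash.

    'packet' is a hypothetical dict whose 'rss_hash' entry is None (or
    missing) when no RSS hash is available for the packet.
    """
    rss = packet.get("rss_hash")
    if rss is not None:
        return rss                       # cheap path: reuse the RSS hash
    return compute_5tuple_hash(packet)   # fallback: hash the headers
```

The saving comes from skipping the header hash whenever a usable RSS hash
is already attached to the packet.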
dpif-netdev: Indicate support for various ct features.
The userspace datapath uses a structure to indicate supported features
that affects debug output. This commit updates that structure to
indicate that "ct_state_nat", "ct_orig_tuple", and "ct_orig_tuple6" are
supported.
tunneling: Avoid datapath-recirc by combining recirc actions at xlate.
This patch set removes the recirculation of encapsulated tunnel packets
if possible. It is done by computing the post tunnel actions at the time of
translation. The combined nested action set is programmed in the datapath
using the CLONE action.
The following test results show the performance improvement offered by
this optimization for tunnel encap.
tunneling: Calculate and update packet l4_offset in tunnel push.
The following tunnel combine patch series avoids recirculating packets
after the tunnel push. So it is necessary to populate all relevant packet
metadata fields for the following combined action set.
For example, the last tunnel_pop operation uses the l4_offset in the packet to
validate the packets. So it must be calculated and updated in the packet before
executing the action. Since there is no recirculation any more, this calculation
is done as part of tunnel_push.
Signed-off-by: Sugesh Chandran <sugesh.chandran@intel.com> Signed-off-by: Zoltán Balogh <zoltan.balogh@ericsson.com> Co-authored-by: Zoltán Balogh <zoltan.balogh@ericsson.com> Signed-off-by: Joe Stringer <joe@ovn.org>
xlate: Clear tunnel mask along with other fields while combining actions.
The tunnel mask in the translation state should be cleared along with other
context fields. It is necessary in 'apply_nested_clone_actions' as it will be
used to combine post tunnel output actions with tunnel push. This ensures
the correct OpenFlow state while executing the translation.
Signed-off-by: Sugesh Chandran <sugesh.chandran@intel.com> Signed-off-by: Zoltán Balogh <zoltan.balogh@ericsson.com> Co-authored-by: Zoltán Balogh <zoltan.balogh@ericsson.com> Signed-off-by: Joe Stringer <joe@ovn.org>
Joe Stringer [Tue, 18 Jul 2017 22:32:44 +0000 (15:32 -0700)]
dpif-netlink: For non-Ethernet, use Ethertype from packet_type.
For non-Ethernet flows, when fixing up the netlink message we need to make
sure to pass down a valid Ethertype. The kernel does not understand
packet_type so it's implicitly encoded by the absence of _ETHERNET and
exact match of _ETHERTYPE. Without this change match_validate in the
kernel complains when trying to match packets from L3 tunnels.
e.g.
openvswitch: netlink: Unexpected mask (mask=110088, allowed=3d9804c)
The mask used to always be set in xlate_wc_init() and xlate_wc_finish(),
but that changed for non-Ethernet frames with the commit listed below.
Fixes: 3d4b2e6eb74e ("userspace: Add OXM field MFF_PACKET_TYPE") Signed-off-by: Joe Stringer <joe@ovn.org> Co-authored-by: Eric Garver <e@erig.me> Acked-by: Eric Garver <e@erig.me>
Joe Stringer [Tue, 18 Jul 2017 22:32:43 +0000 (15:32 -0700)]
dpif-netlink: Use netlink helpers for packet_type.
Rather than open-coding access to netlink attribute pointers in
put_exclude_packet_type(), make use of the netlink attribute helpers.
This simplifies the following bugfix.
Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Eric Garver <e@erig.me>
Yang, Yi Y [Fri, 7 Jul 2017 03:02:13 +0000 (11:02 +0800)]
datapath: enable VxLAN-gpe port creation in compat mode
In compat mode, OVS can't create an L3 VxLAN-gpe port on old
kernels if port creation via rtnetlink fails. This patch
enables old kernels to create an L3 VxLAN-gpe port.
Signed-off-by: Yi Yang <yi.y.yang@intel.com> Signed-off-by: Joe Stringer <joe@ovn.org>
odp-util: Document size of OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV4.
This attribute is exclusive of OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV6 so it
doesn't take up additional space (IPv6 is larger), but it's still worth
documenting.
Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
ofproto-dpif-xlate: Fixes for propagating state of conntrack.
The "ct" action is not supposed to make the various ct match fields
available except for the pipeline instantiated through the "table"
argument to the "ct" action. This commit fixes a few issues related to
that and updates the tests appropriately.
Ben Pfaff [Mon, 17 Jul 2017 16:54:54 +0000 (09:54 -0700)]
connmgr: Fix crash when in_band_create() fails.
update_in_band_remotes() created an in-band manager and then tried to work
with it without first checking whether creation had succeeded. If it
failed, this led to a segfault.
Ben Pfaff [Fri, 14 Jul 2017 21:33:46 +0000 (14:33 -0700)]
Support IPv6 link-local address scopes on Linux.
I hadn't even heard of this feature before, but it seems to be at least
semi-standard to support Linux link-local address scopes via a % suffix,
e.g. fe80::1234%eth0 for a link-local address scoped to eth0. This commit
adds support.
I'd appreciate feedback from folks who understand this feature better than
me.
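As an illustration of the '%' suffix syntax, the sketch below splits a
scoped address into its address and scope parts. It is a Python example
with a hypothetical helper name, not the parsing code added by this
commit, and it does not resolve the scope to an interface index.

```python
import ipaddress


def split_scoped_address(text):
    """Split 'addr%scope' into (IPv6Address, scope name or None)."""
    addr, sep, scope = text.partition("%")
    if sep and not scope:
        raise ValueError("empty scope in %r" % text)
    return ipaddress.IPv6Address(addr), (scope or None)
```

For "fe80::1234%eth0" this yields the address fe80::1234 and the scope
"eth0"; an address without a '%' suffix yields a scope of None.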
Ben Pfaff [Fri, 14 Jul 2017 21:33:45 +0000 (14:33 -0700)]
socket-util: Change ss_format_address() to take a dynamic string.
It's occasionally convenient to format into a fixed-size buffer, but
as the use cases, and the text to be formatted, get more sophisticated,
it becomes easier to deal with "struct ds *" than a buffer pointer and
length pair. An upcoming commit will make ss_format_address() do more
work, and I think that this is the point at which it becomes easier to
take a dynamic string. This commit makes the parameter type change
without yet changing what is formatted.
ovn: l3ha ensure no master bouncing when ovn-controller is restarted
When ovn-controller is restarted, ovn-controller removes the old
Chassis entry from the SBDB and a new one is inserted.
This cleared the Gateway_Chassis chassis column in the SBDB and then
ovn-northd removed the empty-column Gateway_Chassis entry.
This event made the other (non-restarted, master) gateway chassis
believe that it was a single (non-HA) gateway, turning off BFD and
releasing the port for a brief time frame, causing unnecessary downtime.
Signed-off-by: Miguel Angel Ajo <majopela@redhat.com> Signed-off-by: Russell Bryant <russell@ovn.org>
This patch extends gratuitous ARP support for NAT addresses so that it
applies to centralized NAT rules on a HA router.
Gratuitous ARP packets for centralized NAT rules on a HA router
are only generated on the active gateway chassis.
Signed-off-by: Anil Venkata <vkommadi@redhat.com> Signed-off-by: Russell Bryant <russell@ovn.org>
ovn: l3ha, make is_chassis_active aware of gateway_chassis
is_chassis_active now is only true for redirect-chassis ports
in the case of the specific lport being active on the
local chassis.
This will naturally cause the ARP responder / redirection OpenFlow
rules to be inserted/removed automatically when a router goes active/backup,
so that BACKUP routers won't respond to ARP on gateway ports,
and won't route packets that arrive on the wrong gateway
chassis (which can happen until all hypervisors converge on the
new MASTER thanks to the BFD monitoring of the tunnel endpoints).
Signed-off-by: Miguel Angel Ajo <majopela@redhat.com> Signed-off-by: Russell Bryant <russell@ovn.org>
This patch enables the BFD protocol between gateways and transport nodes:
all gateway nodes with any HA chassisredirect port will enable BFD
to all tunnel endpoints, while transport nodes will enable BFD
to all tunnel endpoints hosting an HA gateway chassisredirect port.
Signed-off-by: Venkata Anil <vkommadi@redhat.com> Signed-off-by: Miguel Angel Ajo <majopela@redhat.com> Co-Authored-by: Miguel Angel Ajo <majopela@redhat.com> Signed-off-by: Russell Bryant <russell@ovn.org>
This patch handles multiple gateway_chassis within chassisredirect
ports. All the gateway_chassis within chassisredirect port
will implement the rules to de-encapsulate incoming packets
for such port (please note that later patches in the series
will make is_chassis_redirect conditionals aware of the
MASTER/BACKUP status of the chassis).
Hosts targeting a remote chassisredirect port will set up a
bundle(active_backup, ..) action to each tunnel port, in the given
priority order. Following patches will enable BFD to detect
when a remote gateway chassis is no longer reachable.
Co-authored-by: Venkata Anil Kommaddi <vkommadi@redhat.com> Signed-off-by: Miguel Angel Ajo <majopela@redhat.com> Signed-off-by: Venkata Anil Kommaddi <vkommadi@redhat.com> Signed-off-by: Russell Bryant <russell@ovn.org>
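The active/backup selection over an ordered list of gateway chassis can be
sketched as below. This is a conceptual Python model, not the OpenFlow
bundle action itself; the function name is hypothetical, and treating a
larger number as a higher priority is an assumption of this sketch.

```python
def pick_active_gateway(gateways, is_reachable):
    """Return the first reachable chassis in priority order, or None.

    'gateways' is a list of (chassis_name, priority) pairs. Treating a
    larger number as higher priority is an assumption of this sketch.
    'is_reachable' is a hypothetical liveness callback, e.g. fed by BFD.
    """
    for name, _priority in sorted(gateways, key=lambda g: g[1],
                                  reverse=True):
        if is_reachable(name):
            return name
    return None
```

When the preferred chassis becomes unreachable, the next one in priority
order takes over, and traffic returns to the preferred chassis once it is
reachable again.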
ovn: l3ha, NBDB and SBDB changes and documentation
This commit introduces the north and south db changes necessary for
the l3ha router implementation.
It defines a new Table in both NBDB and SBDB.
The Gateway_Chassis table is created, with a small difference between
NBDB and SBDB: NBDB references the chassis via its name (chassis_name)
and SBDB references the chassis via a reference (chassis) to the Chassis table.
In NBDB a new column (gateway_chassis) is added to Logical_Router_Ports
with a list of Gateway_Chassis which can be empty.
In SBDB a new column (gateway_chassis) is added to Port_Binding with
the same list, this column will be used for ports of type chassis-redirect.
Bump minor version since we've added new backwards compatible features.
Co-authored-by: Russell Bryant <russell@ovn.org> Signed-off-by: Miguel Angel Ajo <majopela@redhat.com> Signed-off-by: Russell Bryant <russell@ovn.org>
BRE alternative (\|) is a GNU sed extension. [1]
It isn't available in NetBSD sed.
[1] http://www.gnu.org/software/sed/manual/sed.html#Regular-Expressions
regexp1\|regexp2
Matches either regexp1 or regexp2. Use parentheses to use
complex alternative regular expressions. The matching process
tries each alternative in turn, from left to right, and the
first one that succeeds is used. It is a GNU extension.
Signed-off-by: YAMAMOTO Takashi <yamamoto@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
- It can cause problems for some utilities (e.g. ovs-pcap) after
the recent change. [1]
See also: http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=51152
xlate: Refactor translation of patch port output actions.
Output to a patch port is translated using the actions of its peer patch port.
Refactor the translation part so that it can be used later in the patch series
for the tunnel push.
Signed-off-by: Sugesh Chandran <sugesh.chandran@intel.com> Signed-off-by: Zoltán Balogh <zoltan.balogh@ericsson.com> Co-authored-by: Zoltán Balogh <zoltan.balogh@ericsson.com> Signed-off-by: Joe Stringer <joe@ovn.org>
Ben Pfaff [Fri, 14 Jul 2017 21:20:07 +0000 (14:20 -0700)]
sparse: Add missing prototype for sendmmsg.
Reported-by: Joe Stringer <joe@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org> Fixes: 00f5565c7eed ("socket-util: Fix recursion issue in sendmmsg") Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
Mark Michelson [Fri, 14 Jul 2017 19:27:24 +0000 (14:27 -0500)]
Free port bindings when deleting cached ports.
Running test "ovn-controller-vtep binding 2" with address sanitizer
enabled resulted in a failure due to a memory leak. The cached switch
port's bindings were not being freed when the port was freed. The
fix is to destroy the bindings hash table when the switch port is
freed.
Signed-off-by: Mark Michelson <mmichels@redhat.com> Reported-by: Lance Richardson <lrichard@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
dp-packet: Remove misleading comment for refill init function.
Function 'dp_packet_batch_refill_init' doesn't return anything.
Looks like this comment came from one of the intermediate versions
of the API enhancement patch. Additionally, the comment style was changed
to be consistent with other comments in the same file.
CC: Andy Zhou <azhou@ovn.org> Fixes: 72c84bc2db23 ("dp-packet: Enhance packet batch APIs.") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Andy Zhou <azhou@ovn.org>
Joe Stringer [Fri, 2 Jun 2017 16:38:47 +0000 (09:38 -0700)]
ofproto-dpif: Detect support for ct_tuple6.
Support for extracting original direction 5 tuple fields from the
connection tracking module may differ on some platforms between the IPv4
original tuple fields vs. IPv6. Detect IPv6 original tuple support
separately and reflect this support up to the OpenFlow layer.
Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>