Darrell Ball [Fri, 29 Jun 2018 06:39:47 +0000 (23:39 -0700)]
conntrack: Fix fragmentation checks.
The ipv4 fragmentation check is broken and allows fragments through.
There were fragile and poorly maintainable checks in extract_l3_ipv*
designed to save a few cycles. The checks make assumptions about what
sanity checks may have been done and could be skipped based on inferring
from the value of another parameter that should be unrelated (l4
pointer needing assignment). Since the benefit is minimal, remove
the special checks and always do sanity checks.
Four tests are added to better maintain fragmentation support.
This needs backporting to 2.9.
Fixes: c8b1ad49da68("conntrack: Reorder sanity checks in extract_l3_ipvx().") Fixes: a489b16854b5("conntrack: New userspace connection tracker.") Signed-off-by: Darrell Ball <dlu998@gmail.com>
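For readers unfamiliar with the check: an IPv4 packet is a fragment when its More Fragments flag is set or its fragment offset is nonzero. The sketch below shows that test on a raw header. is_ipv4_fragment() and make_header() are hypothetical helpers written for illustration; they are not the code in extract_l3_ipv4().

```python
import struct

IP_MF = 0x2000           # "More Fragments" flag in the flags/offset field
IP_OFFSET_MASK = 0x1FFF  # 13-bit fragment offset, in 8-byte units


def is_ipv4_fragment(header: bytes) -> bool:
    """Return True if the IPv4 header describes a fragment (hypothetical)."""
    if len(header) < 20:
        raise ValueError("truncated IPv4 header")
    # Bytes 6-7 hold the 3 flag bits followed by the 13-bit fragment offset.
    (frag_off,) = struct.unpack_from("!H", header, 6)
    return (frag_off & (IP_MF | IP_OFFSET_MASK)) != 0


def make_header(frag_off: int) -> bytes:
    """Build a minimal 20-byte IPv4 header with the given flags/offset."""
    return struct.pack("!BBHHHBBH4s4s",
                       0x45, 0, 20, 0, frag_off, 64, 17, 0,
                       bytes(4), bytes(4))
```

Note that the Don't Fragment flag (0x4000) alone does not make a packet a fragment, which is why it is excluded from the mask.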
Han Zhou [Mon, 25 Jun 2018 17:03:02 +0000 (10:03 -0700)]
ovn.at: Add stateful test for ACL on port groups.
A bug was reported on the feature of applying ACLs on port groups [1].
This bug was not detected by the original test case, because it didn't
test the return traffic and so didn't ensure the stateful feature is
working. The fix [2] causes the original test case to fail, because
once conntrack is enabled, the test packets are dropped: the
checksums in those packets are invalid, so conntrack marks them with
the "invalid" state. To avoid the test case failure, the fix [2] changed
it to test stateless ACLs only, which leaves the scenario untested,
although it is fixed. This patch adds the stateful ACL back to the
test, and replaces dummy/receive with inject-pkt to send the test
packets, so that checksums are properly filled in. It also
adds tests for the return traffic, which ensure that the stateful
feature is working.
Signed-off-by: Han Zhou <hzhou8@ebay.com> Acked-by: Jakub Sitnicki <jkbs@redhat.com> Acked-by: Daniel Alvarez <dalvarez@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Daniel Alvarez [Wed, 20 Jun 2018 02:18:59 +0000 (04:18 +0200)]
ovn-northd: Apply pre ACLs when using Port Groups
When using Port Groups, the pre ACLs were not applied so the
conntrack action was not performed. This patch takes Port Groups
into account when processing the pre ACLs.
As a follow up, we could enhance this patch by creating an index
from lswitch to port groups.
Signed-off-by: Daniel Alvarez <dalvarez@redhat.com> Acked-by: Lucas Alvares Gomes <lucasagomes@gmail.com> Acked-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
aginwala [Sat, 9 Jun 2018 01:33:13 +0000 (18:33 -0700)]
ovndb-servers: Set connection table when using load balancer to manage ovndb clusters via pacemaker.
This will allow setting the inactivity probe on the master node.
For pacemaker to manage ovndb resources via LB, we skipped creating connection
table and hence the inactivity probe was getting set to 5000 by default.
In order to over-ride it we need this table. However, we need to skip slaves
listening on local sb and nb connections table so that LB feature is
intact and only master is listening on 0.0.0.0
e.g --remote=db:OVN_Southbound,SB_Global,connections and
--remote=db:OVN_Northbound,NB_Global,connections
will be skipped for slave SB and NB dbs respectively by unsetting
--db-sb-use-remote-in-db and --db-nb-use-remote-in-db in ovn-ctl.
Signed-off-by: aginwala <aginwala@ebay.com> Acked-by: Numan Siddique <nusiddiq@redhat.com> Acked-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
aginwala [Fri, 8 Jun 2018 19:32:22 +0000 (12:32 -0700)]
ovn-ctl: Support NB and SB DBs to start without using remote connections.
e.g --remote=db:OVN_Southbound,SB_Global,connections and
--remote=db:OVN_Northbound,NB_Global,connections
can be skipped for cases where slaves do not need to listen on nb and sb db
connection tables while using pacemaker with load balancer for ovndb clusters.
Signed-off-by: aginwala <aginwala@ebay.com> Acked-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ian Stokes [Wed, 4 Jul 2018 14:28:33 +0000 (15:28 +0100)]
db-ctl-base: Fix compilation warnings.
This commit fixes uninitialized variable warnings in functions
cmd_create() and cmd_get() when compiling with gcc 6.3.1 and -Werror
by initializing variables 'symbol' and 'new' to NULL.
Cc: Alex Wang <alexw@nicira.com> Fixes: 07ff77ccb82a ("db-ctl-base: Make common database command code into library.") Signed-off-by: Ian Stokes <ian.stokes@intel.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ilya Maximets [Wed, 20 Jun 2018 07:44:51 +0000 (10:44 +0300)]
rconn: Suppress 'connected' log for unreliable connections.
Recent assertion failure fix changed rconn workflow for unreliable
connections (such as connections from ovs-ofctl) from
|rconn|DBG|br-int<->unix#151: entering ACTIVE
|rconn|DBG|br-int<->unix#151: connection closed by peer
|rconn|DBG|br-int<->unix#151: entering DISCONNECTED
to
|rconn|DBG|br-int<->unix#200: entering CONNECTING
|rconn|INFO|br-int<->unix#200: connected
|rconn|DBG|br-int<->unix#200: entering ACTIVE
|rconn|DBG|br-int<->unix#200: connection closed by peer
|rconn|DBG|br-int<->unix#200: entering DISCONNECTED
Many monitoring/configuring tools (e.g. ovs-neutron-agent) use
ovs-ofctl frequently to check the statuses of installed flows.
This produces a lot of "connected" logs, that are useless in general.
Fix that by changing the log level to DBG for unreliable connections.
Suggested-by: Ben Pfaff <blp@ovn.org> Fixes: c9a9b9b00bf5 ("rconn: Introduce new invariant to fix assertion failure in corner case.") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
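The level selection described above can be sketched as follows. This is an illustrative Python model using the standard logging module, not the C code in rconn; connected_log_level() and log_connected() are hypothetical names.

```python
import logging


def connected_log_level(reliable: bool) -> int:
    """Pick the level for the 'connected' message (hypothetical helper).

    Reliable connections keep INFO; unreliable ones (such as short-lived
    ovs-ofctl sessions) are demoted to DEBUG to cut log noise.
    """
    return logging.INFO if reliable else logging.DEBUG


def log_connected(logger: logging.Logger, name: str, reliable: bool) -> None:
    """Emit the 'connected' message at the level chosen above."""
    logger.log(connected_log_level(reliable), "%s: connected", name)
```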
Ben Pfaff [Tue, 3 Jul 2018 18:32:18 +0000 (11:32 -0700)]
ofproto-macros: Ignore "Dropped # log messages" in check_logs.
check_logs ignores some log messages, but it wasn't smart enough to ignore
the messages that said that the ignored messages had been rate-limited.
This fixes the problem.
It's OK to ignore all rate-limiting messages because they only appear if at
least one message was not rate-limited, which check_logs will catch anyway.
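A minimal sketch of the filtering idea, assuming the rate-limit notice contains the words "Dropped <N> log messages" as in the commit title. The real check_logs is a test-suite macro; unexpected_lines() and the sample patterns are hypothetical.

```python
import re

# Assumed shape of the rate-limit notice; only the leading words are taken
# from the commit title.
DROPPED_RE = re.compile(r"Dropped \d+ log messages")


def unexpected_lines(log_lines, ignored_patterns):
    """Return the log lines that match none of the ignored patterns.

    Rate-limit notices are always ignored: they appear only after at least
    one message of the same kind was logged in full.
    """
    patterns = [DROPPED_RE] + [re.compile(p) for p in ignored_patterns]
    return [line for line in log_lines
            if not any(p.search(line) for p in patterns)]
```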
Jakub Sitnicki [Mon, 2 Jul 2018 10:50:09 +0000 (12:50 +0200)]
db-ctl-base: Don't die in ctl_set_column() on error.
Return the error message to the caller instead of reporting it and dying
so that the caller can handle the error without terminating the process
if needed.
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Jakub Sitnicki [Mon, 2 Jul 2018 10:50:08 +0000 (12:50 +0200)]
db-ctl-base: Don't die in pre_list_columns() on error.
Return the error message to the caller instead of reporting it and dying
so that the caller can handle the error without terminating the process
if needed.
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Jakub Sitnicki [Mon, 2 Jul 2018 10:50:07 +0000 (12:50 +0200)]
db-ctl-base: Don't die in pre_parse_column_key_value() on error.
Return the error message to the caller instead of reporting it and dying
so that the caller can handle the error without terminating the process
if needed.
Also, we no longer return the column as it was not used by any of
existing callers.
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Jakub Sitnicki [Mon, 2 Jul 2018 10:50:06 +0000 (12:50 +0200)]
db-ctl-base: Don't die in pre_get_table() on error.
Return the error message to the caller instead of reporting it and dying
so that the caller can handle the error without terminating the process
if needed.
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Jakub Sitnicki [Mon, 2 Jul 2018 10:50:05 +0000 (12:50 +0200)]
db-ctl-base: Don't die in pre_get_column() on error.
Return the error message to the caller instead of reporting it and dying
so that the caller can handle the error without terminating the process
if needed.
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Jakub Sitnicki [Mon, 2 Jul 2018 10:50:04 +0000 (12:50 +0200)]
db-ctl-base: Don't die in ctl_get_row() on error.
Return the error message to the caller instead of reporting it and dying
so that the caller can handle the error without terminating the process
if needed.
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Jakub Sitnicki [Mon, 2 Jul 2018 10:50:03 +0000 (12:50 +0200)]
db-ctl-base: Don't die in get_row_by_id() on multiple matches.
Signal that multiple rows match the record identifier via a new output
parameter instead of reporting the problem and dying, so that the caller
can handle the error without terminating the process if needed.
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Jakub Sitnicki [Mon, 2 Jul 2018 10:50:02 +0000 (12:50 +0200)]
db-ctl-base: Don't die in create_symbol() on error.
Return the error message to the caller instead of reporting it and dying
so that the caller can handle the error without terminating the process
if needed.
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Jakub Sitnicki [Mon, 2 Jul 2018 10:50:01 +0000 (12:50 +0200)]
db-ctl-base: Don't die in set_column() on error.
Return the error message to the caller instead of reporting it and dying
so that the caller can handle the error without terminating the process
if needed.
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Jakub Sitnicki [Mon, 2 Jul 2018 10:50:00 +0000 (12:50 +0200)]
db-ctl-base: Don't die in check_mutable() on error.
Return the error message to the caller instead of reporting it and dying
so that the caller can handle the error without terminating the process
if needed.
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Jakub Sitnicki [Mon, 2 Jul 2018 10:49:59 +0000 (12:49 +0200)]
db-ctl-base: Don't die in is_condition_satisfied() on error.
Return the error message to the caller instead of reporting it and dying
so that the caller can handle the error without terminating the process
if needed.
Also, rename the function as it is no longer a typical predicate, so
that the users don't assume that the result is passed in return value.
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Jakub Sitnicki [Mon, 2 Jul 2018 10:49:58 +0000 (12:49 +0200)]
db-ctl-base: Don't die in get_table() on error.
Return the error message to the caller instead of reporting it and dying
so that the caller can handle the error without terminating the process
if needed.
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Jakub Sitnicki [Mon, 2 Jul 2018 10:49:57 +0000 (12:49 +0200)]
db-ctl-base: Don't die in parse_column_names() on error.
Return the error message to the caller instead of reporting it and dying
so that the caller can handle the error without terminating the process
if needed.
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Aaron Conole [Thu, 28 Jun 2018 00:40:04 +0000 (20:40 -0400)]
checkpatch: fix patch separator line regex
The separator line always starts with three dashes on a line, optionally
followed by either white-space, OR a single space and a filename. The
regex would previously match on any three dashes in a row. This means
that a patch (such as [1]) would trigger the parser state machine to
advance beyond the signed-off checks.
Now, bound the check only to use what git-mailinfo would use as a
separator.
--- <filename>
---<sp>
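The two accepted forms can be expressed as a single anchored pattern. The regex below is a sketch written from the description above and is not necessarily the exact expression used in checkpatch; is_separator() is a hypothetical name.

```python
import re

# Three dashes at the start of the line, then either only whitespace or a
# single space followed by a filename.
SEPARATOR_RE = re.compile(r"^---(\s*| \S.*)$")


def is_separator(line: str) -> bool:
    """Return True if the line looks like a patch separator."""
    return SEPARATOR_RE.match(line.rstrip("\n")) is not None
```

With this pattern, three dashes in the middle of a line, or a longer run of dashes, no longer advance the parser.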
Roi Dayan [Mon, 2 Jul 2018 09:07:58 +0000 (12:07 +0300)]
netdev-tc-offloads: Fix probing multi mask per prio
When adding TC rules we save the prio so that we can reuse the same prio
for the same mask, since a different mask has to use a different prio.
The multi mask per prio probe broke this by using a prio but
get_prio_for_tc_flower() didn't know about it.
Also multi mask per prio support changes the hash calculation.
It is best for the probe to add and delete the ingress qdisc so that
there is a clean start after it.
Signed-off-by: Roi Dayan <roid@mellanox.com> Acked-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
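The prio bookkeeping described above amounts to a mask-to-prio map. The class below is an illustrative model only; PrioAllocator and its methods are hypothetical and do not mirror get_prio_for_tc_flower().

```python
class PrioAllocator:
    """Hand out one TC priority per distinct mask (illustrative model)."""

    def __init__(self, first_prio: int = 1):
        self._next = first_prio
        self._by_mask = {}

    def get_prio(self, mask: bytes) -> int:
        """Return the saved prio for this mask, allocating one if needed."""
        if mask not in self._by_mask:
            self._by_mask[mask] = self._next
            self._next += 1
        return self._by_mask[mask]
```

In this model, a probe that used a prio without going through the allocator would leave the map unaware of it, which is the kind of mismatch the commit addresses.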
Greg Rose [Fri, 29 Jun 2018 18:18:13 +0000 (11:18 -0700)]
utilities: On RHEL 7 systems clean up after upgrade
When upgrading from older versions of OVS that used the built-in geneve
kernel module on RHEL 7 systems to newer versions that use the 'compat'
vport_geneve and vport_vxlan drivers we need to clean up some cruft
that might have been left over after the upgrade.
Remove any genev_sys_6081 and vxlan_sys_4789 interfaces and then if
the RHEL 7 geneve or vxlan built-in drivers are loaded remove them
before loading the new drivers.
Removing the geneve and vxlan built-in drivers will prevent occurrences
of the "unassociated datapath" errors that can sometimes occur in some
environments.
Greg Rose [Fri, 29 Jun 2018 03:31:26 +0000 (20:31 -0700)]
datapath: Add missing code in ip_tunnel_lookup()
The compat rpl_ip_tunnel_lookup() function was missing some code added
in Linux kernel release 4.3 but not backported in the initial commit.
This also allows us to remove an old hack in erspan_rcv() that was
zeroing out the key parameter so that the tunnel lookups wouldn't fail.
Fixes: 8e53509c ("gre: introduce native tunnel support for ERSPAN") Reported-by: William Tu <u9012063@gmail.com> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: Justin Pettit <jpettit@ovn.org>
Greg Rose [Fri, 29 Jun 2018 03:31:25 +0000 (20:31 -0700)]
compat: Fix gre header bug
Commit 436d36db introduced a bug into the gre header build for gre and
ip gre type tunnels. __vlan_hwaccel_push_inside does not check whether
the vlan tag is even present. So check first and avoid padding space
for a vlan tag that isn't present.
Fixes: 436d36db ("compat: Fixups for newer kernels") Signed-off-by: Greg Rose <gvrose8192@gmail.com> Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: Justin Pettit <jpettit@ovn.org>
OVN: do not mark ND packets for conntrack in PRE_LB stage
Do not send Neighbor Discovery packets to conntrack module if
load balancing rules have been added to NB db since otherwise
Neighbor Advertisement frames will be discarded by OVN.
In order to reproduce the issue it is enough to add 2 logical ports
to a single logical switch, assign an IPv6 address to each VIF, and
define a load balance rule on the logical switch. After a while the
ping6 from VIF1 to VIF2 will stop since the vm will not receive any NA
packet
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Darrell Ball [Thu, 28 Jun 2018 05:15:43 +0000 (22:15 -0700)]
ovn: Fix gateway load balancing.
Non-distributed and distributed gateway load balancing is broken.
Recent changes for port unreachable handling broke the associated
unsnat functionality. The fix is to check for gateway
contexts and accept packets directed to gateway router IPs.
Fixes: 86558ac2e476 ("OVN: add UDP port unreachable support to OVN logical router.") Fixes: 159932c9e4ea ("OVN: add TCP port unreachable support to OVN logical router.") Fixes: 0e858e05f76b ("OVN: add protocol unreachable support to OVN router ports.") CC: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Gurucharan Shetty <guru@ovn.org>
John Hurley [Thu, 28 Jun 2018 16:03:07 +0000 (17:03 +0100)]
netdev-linux: monitor and offload LAG slaves to TC
A LAG slave cannot be added directly to an OvS bridge, nor can an OvS
bridge port be added to a LAG dev. However, LAG masters can be added to
OvS.
Use TC blocks to indirectly offload slaves when their master is attached
as a linux-netdev to an OvS bridge. In the kernel TC datapath, blocks link
together netdevs in a similar way to LAG devices. For example, if a filter
is added to a block then it is added to all block devices, or if stats are
incremented on 1 device then the stats on the entire block are incremented.
This mimics LAG devices in that if a rule is applied to the LAG master
then it should be applied to all slaves etc.
Monitor LAG slaves via the netlink socket in netdev-linux and, if their
master is attached to the OvS bridge and has a block id, add the slave's
qdisc to the same block. Similarly, if a slave is freed from a master,
remove the qdisc from the master's block.
Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
John Hurley [Thu, 28 Jun 2018 16:03:06 +0000 (17:03 +0100)]
netdev-linux: assign LAG devs to tc blocks
Assign block ids to LAG masters that are added to OvS as linux-netdevs and
offloaded via offload API calls. Only LAG masters are assigned to blocks.
To ensure uniqueness, the block ids are determined by the netdev ifindex.
Implement a get_block_id op for linux netdevs to achieve this.
Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
John Hurley [Thu, 28 Jun 2018 16:03:05 +0000 (17:03 +0100)]
netdev-linux: indicate if netdev is a LAG master
If a linux netdev is added to OvS that is a LAG master (for example, a
bond or team netdev) then record this in bool form in the dev struct. Use
the link info extracted from rtnetlink calls to determine this.
Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
John Hurley [Thu, 28 Jun 2018 16:03:04 +0000 (17:03 +0100)]
rtnetlink: extend parser to include kind of master and slave
Extend the rtnetlink_parse function to look for linkinfo attributes and,
in turn, store pointers to the master and slave kinds (if any) in the
rtnetlink_change struct.
Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
John Hurley [Thu, 28 Jun 2018 16:03:03 +0000 (17:03 +0100)]
netdev-provider: add class op to get block_id
Add a new class op for netdevs to get the block_id if one exists. The
block_id is used in offload ops to group multiple qdiscs together.
Stub calls are made to the new class op (implementation to follow in
further patches). The default block_id of 0 (no block) will be used in
these cases.
Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
John Hurley [Thu, 28 Jun 2018 16:03:02 +0000 (17:03 +0100)]
tc: allow offloading of block ids
Blocks, in tc classifiers, allow the grouping of multiple qdiscs with an
associated block id. Whenever a filter is added to/removed from this
block, the filter is added to/removed from all associated qdiscs.
Extend TC offload functions to take a block id as a parameter. If the id
is zero then the qdisc is not considered part of a block.
Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
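The block semantics described above can be modeled in a few lines. This is a toy Python model, not the kernel or OVS data structures; Qdisc and Block are hypothetical classes, and the model copies filters to each qdisc purely to make the effect visible.

```python
class Qdisc:
    """A qdisc on one device, holding the filters that apply to it."""

    def __init__(self, dev: str):
        self.dev = dev
        self.filters = set()


class Block:
    """Group of qdiscs that share filters (illustrative model)."""

    def __init__(self, block_id: int):
        if block_id == 0:
            raise ValueError("block id 0 means 'not part of a block'")
        self.block_id = block_id
        self.qdiscs = []
        self.filters = set()

    def attach(self, qdisc: Qdisc) -> None:
        # A newly attached qdisc picks up the filters already on the block.
        self.qdiscs.append(qdisc)
        qdisc.filters |= self.filters

    def add_filter(self, flt: str) -> None:
        self.filters.add(flt)
        for qdisc in self.qdiscs:
            qdisc.filters.add(flt)

    def del_filter(self, flt: str) -> None:
        self.filters.discard(flt)
        for qdisc in self.qdiscs:
            qdisc.filters.discard(flt)
```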
Ben Pfaff [Wed, 27 Jun 2018 14:07:49 +0000 (07:07 -0700)]
ofp-meter: Fix ofp_print_meter_flags() output.
It had a missing space.
CC: Yifeng Sun <pkusunyifeng@gmail.com> Fixes: 61677bf976e9 ("ofp-meter: Fix ds_put_format that treats enum type as short integer") Signed-off-by: Ben Pfaff <blp@ovn.org>
Yifeng Sun [Tue, 26 Jun 2018 21:23:49 +0000 (14:23 -0700)]
ofp-meter: Fix ds_put_format that treats enum type as short integer
Travis job fails because of the below error and this patch solves this issue.
lib/ofp-meter.c:340:48: error: format specifies type 'unsigned short'
but the argument has underlying type 'unsigned int' [-Werror,-Wformat]
ds_put_format(s, "flags:0x%"PRIx16" ", flags);
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
During the investigation of a kernel panic, we encountered a condition
that triggered a kernel panic due to a large skb with an unusual
geometry. Inside of the STT codepath, an effort is made to linearize
such packets to avoid trouble during both fragment reassembly and
segmentation in the linux networking core.
As currently implemented, kernels with CONFIG_SLUB defined will skip
this process because it does not expect an skb with a frag_list to be
present. This patch removes the assumption, and allows these skb to
be linearized as intended. We confirmed this corrects the panic we
encountered.
Aaron Conole [Wed, 20 Jun 2018 18:40:58 +0000 (14:40 -0400)]
checkpatch: Only consider certain signoffs
Formatted patches can contain a hierarchy of sign-offs. This is true when
merging patches from different projects (eg. backports to the datapath
directory from the linux net project).
This means that a submitted backport will contain multiple signed-off
tags, and not all should be considered.
This commit updates checkpatch to only consider those signoff lines which
start at the beginning of a line. So the following:
    Signed-off-by: Foo Bar <foo@bar.com>
should not trigger.
Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
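A sketch of the rule, assuming that quoted sign-offs from the upstream project are indented while the submitter's own tags start in column 0. top_level_signoffs() is a hypothetical helper, not the checkpatch implementation.

```python
import re

# No leading whitespace is allowed: indented tags are treated as quoted
# from another project's commit message.
SIGNOFF_RE = re.compile(r"^Signed-off-by: (.+)$")


def top_level_signoffs(message: str):
    """Return the sign-off identities that start at the beginning of a line."""
    found = []
    for line in message.splitlines():
        match = SIGNOFF_RE.match(line)
        if match:
            found.append(match.group(1))
    return found
```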
Anand Kumar [Fri, 22 Jun 2018 17:09:27 +0000 (10:09 -0700)]
datapath-windows: Compute ct hash based on 5-tuple and zone
Conntrack 5-tuple consists of src address, dst address, src port,
dst port and protocol which will be unique to a ct session.
Use this information along with zone to compute hash.
Also re-factor conntrack code related to parsing netlink attributes.
Testing:
Verified loading/unloading the driver with driver verifier enabled.
Ran TCP/UDP and ICMP traffic.
Signed-off-by: Anand Kumar <kumaranand@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@ovn.org> Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>
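To illustrate hashing on the 5-tuple plus zone, the sketch below packs the six fields into a fixed-layout key and hashes it. CRC32 is used only because it is in the Python standard library; the hash function in the Windows datapath is not stated here, and ct_hash() is a hypothetical name.

```python
import socket
import struct
import zlib


def ct_hash(src: str, dst: str, sport: int, dport: int,
            proto: int, zone: int) -> int:
    """Hash an IPv4 5-tuple together with the conntrack zone (sketch)."""
    key = struct.pack("!4s4sHHBH",
                      socket.inet_aton(src), socket.inet_aton(dst),
                      sport, dport, proto, zone)
    return zlib.crc32(key)
```

Including the zone in the key means the same 5-tuple seen in two zones generally lands in different buckets instead of always colliding.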
Anand Kumar [Fri, 22 Jun 2018 17:09:26 +0000 (10:09 -0700)]
datapath-windows: Implement locking in conntrack NAT.
This patch primarily replaces the existing NDIS RW lock based implementation
for NAT in conntrack with a spinlock based implementation inside the NAT
module, along with some conntrack optimization.
- The 'ovsNatTable' and 'ovsUnNatTable' tables are shared
between cleanup threads and packet processing thread.
In order to protect these two tables use a spinlock.
Also introduce counters to track number of nat entries.
- Introduce a new function OvsGetTcpHeader() to retrieve TCP header
and payload length, to optimize for TCP traffic.
- Optimize conntrack look up.
- Remove 'bucketlockRef' member from conntrack entry structure.
Testing:
Verified loading/unloading the driver with driver verifier enabled.
Ran TCP/UDP and ICMP traffic.
Signed-off-by: Anand Kumar <kumaranand@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@ovn.org> Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>
Anand Kumar [Fri, 22 Jun 2018 17:09:25 +0000 (10:09 -0700)]
datapath-windows: Use spinlock instead of RW lock for ct entry
This patch mainly changes the NDIS RW lock for conntrack entries to a
spinlock, along with some minor refactoring in conntrack. A spinlock
is used instead of an RW lock because RW locks cause performance hits
when acquired/released multiple times.
- Use NdisInterlockedXX wrapper APIs instead of InterlockedXX.
- Update 'ctTotalRelatedEntries' using interlocked functions.
- Move conntrack lock out of NAT module.
Testing:
Verified loading/unloading the driver with driver verifier enabled.
Ran TCP/UDP and ICMP traffic.
Signed-off-by: Anand Kumar <kumaranand@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@ovn.org> Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>
Eelco Chaudron [Wed, 20 Jun 2018 09:04:03 +0000 (11:04 +0200)]
utilities: Add the ovs_show_fdb command to gdb
This adds the ovs_show_fdb command:
Usage: ovs_show_fdb {<bridge_name> {dbg} {hash}}
<bridge_name> : Optional bridge name, if not supplied FDB summary
information is displayed for all bridges.
dbg : Will show structure address information
hash : Will display the forwarding table using the hash
table, rather than the LRU list.
Signed-off-by: Andy Zhou <azhou@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Justin Pettit <jpettit@ovn.org> Reviewed-by: Greg Rose <gvrose8192@gmail.com> Tested-by: Greg Rose <gvrose8192@gmail.com>
Justin Pettit [Tue, 19 Jun 2018 21:10:17 +0000 (14:10 -0700)]
datapath: Fix compiler warning for HAVE_RHEL7_MAX_MTU.
Fixes: 1e40b541bc ("datapath: Fix max MTU size on RHEL 7.5 kernel") Signed-off-by: Justin Pettit <jpettit@ovn.org> Reviewed-by: Greg Rose <gvrose8192@gmail.com>
ovn: Fix DHCP classless static route for non-classful masks.
When determining how many bytes of the IP address need to be included
in the classless static route option, take the number of network bits
in the mask and divide it by 8. If the division leaves a remainder,
do not ignore it: add 1 byte to the length of the option.
Signed-off-by: Rostyslav Fridman <rostyslav_fridman@epam.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
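The rounding rule above is a ceiling division. The sketch below applies it and lays out one route in the order used by the classless static route option (prefix length, significant octets of the destination, router address). Both function names are hypothetical, and this is not the ovn code.

```python
import socket


def significant_octets(prefix_len: int) -> int:
    """Number of destination bytes to encode: ceil(prefix_len / 8)."""
    if not 0 <= prefix_len <= 32:
        raise ValueError("prefix length must be in 0..32")
    return (prefix_len + 7) // 8


def encode_route(network: str, prefix_len: int, router: str) -> bytes:
    """Encode one route: prefix length, significant octets, router."""
    net = socket.inet_aton(network)
    count = significant_octets(prefix_len)
    return bytes([prefix_len]) + net[:count] + socket.inet_aton(router)
```

For example, a /23 mask needs 3 destination bytes, whereas plain integer division (23 // 8) would give 2 and truncate the network.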
Lorenzo Bianconi [Mon, 18 Jun 2018 11:56:00 +0000 (13:56 +0200)]
OVN: add protocol unreachable support to OVN router ports
Add priority-70 flows to generate ICMP protocol unreachable messages
in reply to packets directed to the router's IP address on IP protocols
other than UDP, TCP, and ICMP
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Lorenzo Bianconi [Mon, 18 Jun 2018 11:55:59 +0000 (13:55 +0200)]
OVN: add TCP port unreachable support to OVN logical router
Add priority-80 flows to generate TCP reset messages in reply to
TCP datagrams directed to the router's IP address since the
logical router doesn't accept any TCP traffic
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Lorenzo Bianconi [Mon, 18 Jun 2018 11:55:58 +0000 (13:55 +0200)]
OVN: add UDP port unreachable support to OVN logical router
Add priority-80 flows to generate ICMP port unreachable messages in
reply to UDP datagrams directed to the router's IP address since the
logical router doesn't accept any UDP traffic
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Han Zhou [Wed, 30 May 2018 17:08:26 +0000 (10:08 -0700)]
ovsdb-idl: Remove unnecessary code in track clear.
ovsdb_idl_db_track_clear() needs to free the deleted row.
However, it is unnecessary to call ovsdb_idl_row_clear_old(), because
this has already been called in ovsdb_idl_row_destroy(). It is also
confusing because it is called only if:
if (ovsdb_idl_row_is_orphan(row))
This contradicts the check in ovsdb_idl_row_clear_old():
if (!ovsdb_idl_row_is_orphan(row))
(Currently the tracked row doesn't maintain any data, so there is no
leak.)
Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Kyle Simpson [Wed, 6 Jun 2018 14:17:59 +0000 (15:17 +0100)]
ofp-actions: Build action_set in one scan of action_list.
The previous implementation scans the action set of each WRITE_ACTIONS
command 13--17 times when moving the actions over. This change builds
up the list as a single scan, which should be more efficient.
Signed-off-by: Kyle Simpson <kyleandrew.simpson@gmail.com> Co-authored-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Thu, 14 Jun 2018 20:43:55 +0000 (13:43 -0700)]
ovs-sim: Don't install manpage at all (except from ovs-sim itself).
ovs-sim is a funny utility since it only works from a build tree, not from
an installed OVS. That means that we shouldn't install its manpage when
we run "make install". But we do want to install the manpage when we're
inside ovs-sim itself, so that the user can invoke "man ovs-sim" from its
nested shell.
This commit makes this happen.
Suggested-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Roi Dayan <roid@mellanox.com>
Ben Pfaff [Mon, 18 Jun 2018 18:45:23 +0000 (11:45 -0700)]
ovn-northd: Always allocate ipam_info for an ovn_datapath.
Until now, the ipam_info struct for a datapath has been allocated on
demand. This leads to slight complication in the code in places, and
there is hardly any benefit since ipam_info is only about 48 bytes anyway.
This commit just inlines it into struct ovn_datapath.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Mark Michelson <mmichels@redhat.com>
ofproto: Fix OVS crash when reverting old flows in bundle commit
During bundle commit, flows added in the bundle are applied
to ofproto in order. If a flow cannot be added (e.g. the flow
action is a go-to for a group id that does not exist), OVS tries to
revert all previous flows from the same bundle that were
successfully applied. This is possible since OVS maintains a list
of old flows that were replaced by flows from the bundle.
While reinserting old flows, OVS hits an assertion because of the
check against RULE_INITIALIZED. That check holds only for new flows;
for an old flow the rule state is RULE_REMOVED, so the assertion
fails and OVS crashes.
The assert check should be changed to != RULE_INSERTED, to prevent
any existing rule from being re-inserted while allowing new rules and
old rules (in case of revert) to be inserted.
Here is an example to trigger the assert:
$ ovs-vsctl add-br br-test -- set Bridge br-test datapath_type=netdev
First flow rule will be modified since it is a valid rule. However second
rule is invalid since no group with id 10 exists. Bundle commit tries to
revert (insert) the first rule to old flow which results in ovs_assert at
ofproto_rule_insert__() since old rule->state = RULE_REMOVED.
Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
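The relaxed check can be modeled as a small state machine. This is an illustrative Python sketch of the rule states named above, not the ofproto code; Rule, rule_insert() and rule_remove() are hypothetical.

```python
from enum import Enum, auto


class RuleState(Enum):
    INITIALIZED = auto()
    INSERTED = auto()
    REMOVED = auto()


class Rule:
    def __init__(self):
        self.state = RuleState.INITIALIZED


def rule_insert(rule: Rule) -> None:
    # Relaxed check: only an already-inserted rule is rejected, so both new
    # (INITIALIZED) and reverted (REMOVED) rules may be inserted.
    if rule.state == RuleState.INSERTED:
        raise AssertionError("rule is already inserted")
    rule.state = RuleState.INSERTED


def rule_remove(rule: Rule) -> None:
    if rule.state != RuleState.INSERTED:
        raise AssertionError("rule is not inserted")
    rule.state = RuleState.REMOVED
```

With the stricter check (state must be INITIALIZED), the revert step in this model would raise for a rule that had been replaced and removed earlier in the same bundle.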
Gavi Teitz [Thu, 7 Jun 2018 06:36:59 +0000 (09:36 +0300)]
dpctl: Properly reflect a rule's offloaded to HW state
Previously, any rule that is offloaded via a netdev, not necessarily
to the HW, would be reported as "offloaded". This patch fixes this
misalignment, and introduces the 'dp' state, as follows:
rule is in HW via TC offload -> offloaded=yes dp:tc
rule is not in HW over TC DP -> offloaded=no dp:tc
rule is not in HW over OVS DP -> offloaded=no dp:ovs
To achieve this, the flow's 'offloaded' flag was encapsulated in a new
attrs struct, which contains the offloaded state of the flow and the
DP layer the flow is handled in, and instead of setting the flow's
'offloaded' state based solely on the type of dump it was acquired
via, for netdev flows it now sends the new attrs struct to be
collected along with the rest of the flow via the netdev, allowing
it to be set per flow.
For TC offloads, the offloaded state is set based on the 'in_hw' and
'not_in_hw' flags received from the TC as part of the flower. If no
such flag was received, due to lack of kernel support, it defaults
to true.
Signed-off-by: Gavi Teitz <gavi@mellanox.com> Acked-by: Roi Dayan <roid@mellanox.com>
[simon: resolved conflict in lib/dpctl.man] Signed-off-by: Simon Horman <simon.horman@netronome.com>
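The flag handling for TC offloads can be sketched as below, using the output format from the table above. tc_offloaded() and format_state() are hypothetical helpers for illustration, not the dpctl or netdev code.

```python
def tc_offloaded(in_hw: bool, not_in_hw: bool) -> bool:
    """Derive the offloaded state from the flower flags (sketch)."""
    if in_hw:
        return True
    if not_in_hw:
        return False
    # Neither flag was reported (no kernel support): default to true.
    return True


def format_state(offloaded: bool, dp: str) -> str:
    """Render the state in the 'offloaded=<yes|no> dp:<layer>' form."""
    return "offloaded=%s dp:%s" % ("yes" if offloaded else "no", dp)
```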
Aaron Conole [Fri, 1 Jun 2018 18:28:49 +0000 (14:28 -0400)]
rhel: selinux-policy to invoke proper label macros
The rpm doesn't invoke all of the required selinux helpers to enact labeling
or relabeling on all versions of Fedora/RHEL. According to:
https://fedoraproject.org/wiki/SELinux/IndependentPolicy
This commit switches to use the selinux rpm macros which will ensure that
all of the labels defined in the .fc.in file are applied properly.
This commit uses the previously defined selinux label to transition
from the openvswitch_t to openvswitch_load_module_t domain by
executing ovs-kmod-ctl that is labelled with
openvswitch_load_module_exec_t type.
Note that unless the selinux relabel operation is invoked, the script
will not be labelled. This merely instructs the selinux tools that
ovs-kmod-ctl should have a label applied.
Aaron Conole [Fri, 1 Jun 2018 18:28:46 +0000 (14:28 -0400)]
selinux: allow openvswitch_t net_broadcast and net_raw
The ovs-vswitchd daemon requires both CAP_NET_RAW and
CAP_NET_BROADCAST, but these are generally prevented by
selinux policy. This allows these capabilities to be retained by the
openvswitch_t domain.
Aaron Conole [Fri, 1 Jun 2018 18:28:45 +0000 (14:28 -0400)]
selinux: create a transition type for module loading
Defines a type 'openvswitch_load_module_t' used exclusively for loading
modules. This means that the 'openvswitch_t' domain won't require
access to the module loading facility - such access can only happen
after transitioning through the 'openvswitch_load_module_exec_t'
transition context.
A future commit will instruct the selinux policy on how to label the
appropriate script with extended attributes to make use of this new domain.
Aaron Conole [Fri, 1 Jun 2018 18:28:44 +0000 (14:28 -0400)]
ovs-kmod-ctl: introduce a kernel module load script
Currently, Open vSwitch on linux embeds the logic of loading and unloading
kernel modules into the ovs-ctl and ovs-lib script files. This works, but
it means that there is no way to leverage extended filesystem attributes
to grant fine-grained permissions relating to module loading.
The split-out utility 'ovs-kmod-ctl' will be used in an upcoming commit
for RHEL-based distributions to have a separate transition domain that
will allow module loading to be given to a separate selinux domain from
the openvswitch_t domain.
Mark Michelson [Thu, 17 May 2018 17:16:55 +0000 (13:16 -0400)]
ovsdb-idl: Correct singleton insert logic
When inserting data into a "singleton" table (one that has maxRows ==
1), there is a check that ensures that the table is currently empty
before inserting the row. The intention is to prevent races where
multiple clients might attempt to insert rows at the same time.
The problem is that this singleton check can cause legitimate
transactions to fail. Specifically, a transaction that attempts to
delete the current content of the table and insert new data will cause
the singleton check to fail since the table currently has data.
This patch corrects the issue by keeping a count of the rows being
deleted and added to singleton tables. If the total is larger than zero,
then the net operation is attempting to insert rows. If the total is
less than zero, then the net operation is attempting to remove rows. If
the total is zero, then the operation is inserting and deleting an equal
number of rows (or is just updating rows). We only add the singleton
check if the total is larger than zero.
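The counting rule described above can be sketched as follows. This is a minimal illustration, not the IDL's actual code; the function name and its arguments are hypothetical.

```python
def needs_singleton_check(rows_inserted, rows_deleted):
    """Return True when a transaction adds net rows to a singleton table.

    Hypothetical helper: the net count is insertions minus deletions, and
    the "table must be empty" check is only added when that count is
    larger than zero.
    """
    net_rows = rows_inserted - rows_deleted
    return net_rows > 0


# Plain insert into a singleton table: the check is added.
print(needs_singleton_check(1, 0))
# Delete the current row and insert a replacement: net zero, no check.
print(needs_singleton_check(1, 1))
```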
This patch also includes a new test for singleton tables that ensures
that the maxRows constraint works as expected.
Signed-off-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Aaron Conole [Fri, 15 Jun 2018 13:20:12 +0000 (09:20 -0400)]
netdev-dpdk: Avoid warning for snprintf() call.
lib/netdev-dpdk.c: In function :
lib/netdev-dpdk.c:2865:49: warning: output may be truncated before the last format character [-Wformat-truncation=]
snprintf(vhost_vring, 16, "vring_%d_size", i);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
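The arithmetic behind the warning can be checked with a short sketch. It assumes a 32-bit int, which is what the compiler must consider as the worst case for %d, and it does not reproduce the fix made by this commit; required_buffer_size() is a hypothetical helper.

```python
INT_MIN_32 = -2**31  # assumption: 32-bit int, the worst case for "%d"


def required_buffer_size(fmt, value):
    """Bytes snprintf() would need for the formatted text plus the NUL."""
    return len(fmt % value) + 1


# A small index fits comfortably in the 16-byte buffer.
print(required_buffer_size("vring_%d_size", 0))
# The worst case the compiler has to assume does not fit in 16 bytes.
print(required_buffer_size("vring_%d_size", INT_MIN_32))
```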
Suggested-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Ilya Maximets <i.maximets@samsung.com>
Justin Pettit [Thu, 14 Jun 2018 02:12:31 +0000 (19:12 -0700)]
ovs-dpctl: Remove redundant documentation from man page.
Remove descriptions of options that are already described with the
command. These options were not staying current with the commands that
supported them.
Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Justin Pettit [Thu, 14 Jun 2018 01:29:49 +0000 (18:29 -0700)]
dpctl: Prefer "--more" to indicate verbosity for "ct-stats-show".
The "ct-stats-show" used the keyword "verbose" to indicate verbosity,
but the more standard way in OVS is to use "-m" or "--more". This
commit continues to support the keyword method, but adds support for
"-m" and "--more" and documents their use.
Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Ivan Dyukov [Tue, 5 Jun 2018 14:37:42 +0000 (17:37 +0300)]
tests/stp: Make validation of flows before changing of topology.
The change fixes random stp test failure. Accuracy is about 20%.
The failing test is the following:
2337: STP - flush the fdb and mdb when topology changed
In some cases, a validation is executed after the topology change, which
increases the time needed for STP to stabilize. To prevent this, a delay
that waits for the validation is added before deleting a port.
CC: Tonghao Zhang <xiangxia.m.yue@gmail.com> Fixes: 427e9751f300 ("tests: Add and improve stp tests.") Signed-off-by: Ivan Dyukov <i.dyukov@samsung.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Flavio Leitner [Thu, 7 Jun 2018 14:10:52 +0000 (11:10 -0300)]
linux: Assume it is local if no API is available.
If the 'openvswitch' kernel module is not loaded, the API is not
available and the userspace will keep retrying. This approach is
not ideal for the netdev datapath type.
This patch disables netns support if the error code returned
indicates that the API is not available.
Timothy Redaelli [Mon, 11 Jun 2018 11:15:35 +0000 (13:15 +0200)]
tests: Fix test that tests if the system doesn't support IPv6
Currently if IPv6 is globally disabled (net.ipv6.conf.all.disable_ipv6=1) or
if IPv6 is disabled on loopback interface (net.ipv6.conf.lo.disable_ipv6=1)
the check doesn't work since no interface has ::1 and EADDRNOTAVAIL is
returned.
This causes a Python exception to be printed, like this:
Traceback (most recent call last):
File "<string>", line 6, in <module>
File "/usr/lib64/python2.7/socket.py", line 228, in meth
return getattr(self._sock,name)(*args)
socket.error: [Errno 99] Cannot assign requested address
In this case HAVE_IPV6 is not set and all IPv6 tests fail.
This commit fixes the problem by also checking for EADDRNOTAVAIL.
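A check along these lines can be sketched in Python 3 as follows. This is an illustration rather than the test suite's actual snippet: the function names are hypothetical, and treating EAFNOSUPPORT as the error the earlier check already handled is an assumption.

```python
import errno
import socket

# Assumption: EAFNOSUPPORT was already handled; EADDRNOTAVAIL is the new case.
IPV6_UNAVAILABLE_ERRNOS = (errno.EAFNOSUPPORT, errno.EADDRNOTAVAIL)


def is_ipv6_unavailable(exc):
    """True if the error means the host has no usable IPv6 loopback."""
    return exc.errno in IPV6_UNAVAILABLE_ERRNOS


def have_ipv6():
    """Try to bind to ::1; report False instead of raising when IPv6 is off."""
    sock = None
    try:
        sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
        sock.bind(("::1", 0, 0, 0))
        return True
    except OSError as exc:
        if is_ipv6_unavailable(exc):
            return False
        raise  # any other error is unexpected and should stay visible
    finally:
        if sock is not None:
            sock.close()
```

Calling have_ipv6() on a host with net.ipv6.conf.lo.disable_ipv6=1 would then return False rather than printing a traceback.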
CC: Ben Pfaff <blp@ovn.org> Fixes: 5c1d812d7fb3 ("tests: Avoid printing Python exception for hosts without IPv6 support.") Signed-off-by: Timothy Redaelli <tredaelli@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>