]> git.proxmox.com Git - mirror_ubuntu-kernels.git/log
mirror_ubuntu-kernels.git
6 years agonet: Use static_key for XPS maps
Amritha Nambiar [Sat, 30 Jun 2018 04:26:46 +0000 (21:26 -0700)]
net: Use static_key for XPS maps

Use static_key for XPS maps to reduce the cost of extra map checks,
similar to how it is used for RPS and RFS. This includes static_key
'xps_needed' for XPS and another for 'xps_rxqs_needed' for XPS using
Rx queues map.

Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: Refactor XPS for CPUs and Rx queues
Amritha Nambiar [Sat, 30 Jun 2018 04:26:41 +0000 (21:26 -0700)]
net: Refactor XPS for CPUs and Rx queues

Refactor XPS code to support Tx queue selection based on
CPU(s) map or Rx queue(s) map.

Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'mlxsw-Add-resource-scale-tests'
David S. Miller [Sat, 30 Jun 2018 13:06:16 +0000 (22:06 +0900)]
Merge branch 'mlxsw-Add-resource-scale-tests'

Petr Machata says:

====================
mlxsw: Add resource scale tests

There are a number of tests that check features of the Linux networking
stack. By running them on suitable interfaces, one can exercise the
mlxsw offloading code. However none of these tests attempts to push
mlxsw to the limits supported by the ASIC.

As an additional wrinkle, the "limits supported by the ASIC" themselves
may not be a set of fixed numbers, but rather depend on a profile that
determines how the ASIC resources are allocated for different purposes.

This patchset introduces several tests that verify capability of mlxsw
to offload amounts of routes, flower rules, and mirroring sessions that
match predicted ASIC capacity, at different configuration profiles.
Additionally they verify that amounts exceeding the predicted capacity
can *not* be offloaded.

These are not generic tests, but ones that are tailored for mlxsw
specifically. For that reason they are not added to net/forwarding
selftests subdirectory, but rather to a newly-added drivers/net/mlxsw.

Patches #1, #2 and #3 tweak the generic forwarding/lib.sh to support the
new additions.

In patches #4 and #5, new libraries for interfacing with devlink are
introduced, first a generic one, then a Spectrum-specific one.

In patch #6, a devlink resource test is introduced.

Patches #7 and #8, #9 and #10, and #11 and #12 introduce three scale
tests: router, flower and mirror-to-gretap. The first of each pair of
patches introduces a generic portion of the test (mlxsw-specific), the
second introduces a Spectrum-specific wrapper.

Patch #13 then introduces a scale test driver that runs (possibly a
subset of) the tests introduced by patches from previous paragraph.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests: mlxsw: Add scale test for resources
Yuval Mintz [Sat, 30 Jun 2018 00:53:52 +0000 (02:53 +0200)]
selftests: mlxsw: Add scale test for resources

Add a scale test capable of validating that offloaded network
functionality is indeed functional at scale when configured to
the different KVD profiles available.

Start by testing offloaded routes are functional at scale by
passing traffic on each one of them in turn.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests: mlxsw: Add target for mirror-to-gretap test on spectrum
Petr Machata [Sat, 30 Jun 2018 00:53:14 +0000 (02:53 +0200)]
selftests: mlxsw: Add target for mirror-to-gretap test on spectrum

Add a wrapper around mlxsw/mirror_gre_scale.sh that parameterized number
of offloadable mirrors on Spectrum machines.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests: mlxsw: Add scale test for mirror-to-gretap
Petr Machata [Sat, 30 Jun 2018 00:52:34 +0000 (02:52 +0200)]
selftests: mlxsw: Add scale test for mirror-to-gretap

Test that it's possible to offload a given number of mirrors.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests: mlxsw: Add target for tc flower test on spectrum
Petr Machata [Sat, 30 Jun 2018 00:51:55 +0000 (02:51 +0200)]
selftests: mlxsw: Add target for tc flower test on spectrum

Add a wrapper around mlxsw/tc_flower_scale.sh that parameterizes the
generic tc flower scale test template with Spectrum-specific target
values.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests: mlxsw: Add tc flower scale test
Petr Machata [Sat, 30 Jun 2018 00:51:05 +0000 (02:51 +0200)]
selftests: mlxsw: Add tc flower scale test

Add test of capacity to offload flower.

This is a generic portion of the test that is meant to be called from a
driver that supplies a particular number of rules to be tested with.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests: mlxsw: Add target for router test on spectrum
Yuval Mintz [Sat, 30 Jun 2018 00:50:28 +0000 (02:50 +0200)]
selftests: mlxsw: Add target for router test on spectrum

IPv4 routes in Spectrum are based on the kvd single-hash, but as it's
a hash we need to assume we cannot reach 100% of its capacity.

Add a wrapper that provides us with good/bad target numbers for the
Spectrum ASIC.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
[petrm@mellanox.com: Drop shebang.]
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests: mlxsw: Add router test
Arkadi Sharshevsky [Sat, 30 Jun 2018 00:49:32 +0000 (02:49 +0200)]
selftests: mlxsw: Add router test

This test aims for both stand alone and internal usage by the resource
infra. The test receives the number routes to offload and checks:
- The routes were offloaded correctly
- Traffic for each route.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests: mlxsw: Add devlink KVD resource test
Yuval Mintz [Sat, 30 Jun 2018 00:48:27 +0000 (02:48 +0200)]
selftests: mlxsw: Add devlink KVD resource test

Add a selftest that can be used to perform basic sanity of the devlink
resource API as well as test the behavior of KVD manipulation in the
driver.

This is the first case of a HW-only test - in order to test the devlink
resource a driver capable of exposing resources has to be provided
first.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
[petrm@mellanox.com: Extracted two patches out of this patch. Tweaked
commit message.]
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests: mlxsw: Add devlink_lib_spectrum.sh
Petr Machata [Sat, 30 Jun 2018 00:48:12 +0000 (02:48 +0200)]
selftests: mlxsw: Add devlink_lib_spectrum.sh

This library builds on top of devlink_lib.sh and contains functionality
specific to Spectrum ASICs, e.g., re-partitioning the various KVD
sub-parts.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
[petrm@mellanox.com: Split this out from another patch. Fix line length
in devlink_sp_read_kvd_defaults().]
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests: forwarding: Add devlink_lib.sh
Petr Machata [Sat, 30 Jun 2018 00:47:08 +0000 (02:47 +0200)]
selftests: forwarding: Add devlink_lib.sh

This helper library contains wrappers to devlink functionality agnostic
to the underlying device.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
[petrm@mellanox.com: Split this out from another patch.]
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests: forwarding: lib: Parameterize NUM_NETIFS in two functions
Petr Machata [Sat, 30 Jun 2018 00:46:05 +0000 (02:46 +0200)]
selftests: forwarding: lib: Parameterize NUM_NETIFS in two functions

setup_wait() and tc_offload_check() both assume that all NUM_NETIFS
interfaces are relevant for a given test. However, the scale test script
acts as an umbrella for a number of sub-tests, some of which may not
require all the interfaces.

Thus it's suboptimal for tc_offload_check() to query all the interfaces.
In case of setup_wait() it's incorrect, because the sub-test in question
of course doesn't configure any interfaces beyond what it needs, and
setup_wait() then ends up waiting indefinitely for the extraneous
interfaces to come up.

For that reason, give setup_wait() and tc_offload_check() an optional
parameter with a number of interfaces to probe. Fall back to global
NUM_NETIFS if the parameter is not given.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests: forwarding: lib: Add check_err_fail()
Petr Machata [Sat, 30 Jun 2018 00:45:47 +0000 (02:45 +0200)]
selftests: forwarding: lib: Add check_err_fail()

In the scale testing scenarios, one usually has a condition that is
expected to either fail, or pass, depending on which side of the scale
is being tested.

To capture this logic, add a function check_err_fail(), which dispatches
either to check_err() or check_fail(), depending on the value of the
first argument, should_fail.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests: forwarding: Allow lib.sh sourcing from other directories
Yuval Mintz [Sat, 30 Jun 2018 00:45:19 +0000 (02:45 +0200)]
selftests: forwarding: Allow lib.sh sourcing from other directories

The devlink related scripts are mlxsw-specific. As a result, they'll
reside in a different directory - but would still need the common logic
implemented in lib.sh.
So as a preliminary step, allow lib.sh to be sourced from other
directories as well.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'nfp-flower-updates-and-netconsole'
David S. Miller [Sat, 30 Jun 2018 12:31:57 +0000 (21:31 +0900)]
Merge branch 'nfp-flower-updates-and-netconsole'

Jakub Kicinski says:

====================
nfp: flower updates and netconsole

This set contains assorted updates to driver base and flower.
First patch is a follow up to a fix to calculating counters which
went into net.  For ethtool counters we should also make sure
they are visible even after ring reconfiguration.  Next patch
is a safety measure in case we are running on a machine with a
broken BIOS we should fail the probe when risk of incorrect
operation is too high.  The next two patches add netpoll support
and make use of napi_consume_skb().  Last we populate bus info
on all representors.

Pieter adds support for offload of the checksum action in flower.

John follows up to another fix he's done in net, we set TTL
values on tunnels to stack's default, now Johns does a full
route lookup to get a more precise information, he populates
ToS field as well.  Last but not least he follows up on Jiri's
request to enable LAG offload in case the team driver is used
and then hash function is unknown.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonfp: flower: enabled offloading of Team LAG
John Hurley [Sat, 30 Jun 2018 00:04:42 +0000 (17:04 -0700)]
nfp: flower: enabled offloading of Team LAG

Currently the NFP fw only supports L3/L4 hashing so rejects the offload of
filters that output to LAG ports implementing other hash algorithms. Team,
however, uses a BPF function for the hash that is not defined. To support
Team offload, accept hashes that are defined as 'unknown' (only Team
defines such hash types). In this case, use the NFP default of L3/L4
hashing for egress port selection.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonfp: flower: offload tos and tunnel flags for ipv4 udp tunnels
John Hurley [Sat, 30 Jun 2018 00:04:41 +0000 (17:04 -0700)]
nfp: flower: offload tos and tunnel flags for ipv4 udp tunnels

Extract the tos and the tunnel flags from the tunnel key and offload these
action fields. Only the checksum and tunnel key flags are implemented in
fw so reject offloads of other flags. The tunnel key flag is always
considered set in the fw so enforce that it is set in the rule. Note that
the compulsory setting of the tunnel key flag and optional setting of
checksum is inline with how tc currently generates ipv4 udp tunnel
actions.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonfp: flower: extract ipv4 udp tunnel ttl from route
John Hurley [Sat, 30 Jun 2018 00:04:40 +0000 (17:04 -0700)]
nfp: flower: extract ipv4 udp tunnel ttl from route

Previously the ttl for ipv4 udp tunnels was set to the namespace default.
Modify this to attempt to extract the ttl from a full route lookup on the
tunnel destination. If this is not possible then resort to the default.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonfp: flower: ignore checksum actions when performing pedit actions
Pieter Jansen van Vuuren [Sat, 30 Jun 2018 00:04:39 +0000 (17:04 -0700)]
nfp: flower: ignore checksum actions when performing pedit actions

Hardware will automatically update csum in headers when a set action has
been performed. This means we could in the driver ignore the explicit
checksum action when performing a set action.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonfp: populate bus-info on representors
Jakub Kicinski [Sat, 30 Jun 2018 00:04:38 +0000 (17:04 -0700)]
nfp: populate bus-info on representors

We used to leave bus-info in ethtool driver info empty for
representors in case multi-PCIe-to-single-host cards make
the association between PCIe device and NFP many to one.
It seems these attempts are futile, we need to link the
representors to one PCIe device in sysfs to get consistent
naming, plus devlink uses one PCIe as a handle, anyway.
The multi-PCIe-to-single-system support won't be clean,
if it ever comes.

Turns out some user space (RHEL tests) likes to read bus-info
so just populate it.

While at it remove unnecessary app NULL-check, representors
are spawned by an app, so it must exist.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonfp: make use of napi_consume_skb()
Jakub Kicinski [Sat, 30 Jun 2018 00:04:37 +0000 (17:04 -0700)]
nfp: make use of napi_consume_skb()

Use napi_consume_skb() in nfp_net_tx_complete() to get bulk free.
Pass 0 as budget for ctrl queue completion since it runs out of
a tasklet.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonfp: implement netpoll ndo (thus enabling netconsole)
Jakub Kicinski [Sat, 30 Jun 2018 00:04:36 +0000 (17:04 -0700)]
nfp: implement netpoll ndo (thus enabling netconsole)

NFP NAPI handling will only complete the TXed packets when called
with budget of 0, implement ndo_poll_controller by scheduling NAPI
on all TX queues.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonfp: fail probe if serial or interface id is missing
Jakub Kicinski [Sat, 30 Jun 2018 00:04:35 +0000 (17:04 -0700)]
nfp: fail probe if serial or interface id is missing

On some platforms with broken ACPI tables we may not have access
to the Serial Number PCIe capability.  This capability is crucial
for us for switchdev operation as we use serial number as switch ID,
and for communication with management FW where interface ID is used.

If we can't determine the Serial Number we have to fail device probe.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonfp: expose ring stats of inactive rings via ethtool
Jakub Kicinski [Sat, 30 Jun 2018 00:04:34 +0000 (17:04 -0700)]
nfp: expose ring stats of inactive rings via ethtool

After user changes the ring count statistics for deactivated
rings disappear from ethtool -S output.  This causes loss of
information to the user and means that ethtool stats may not
add up to interface stats.  Always expose counters from all
the rings.  Note that we allocate at most num_possible_cpus()
rings so number of rings should be reasonable.

The alternative of only listing stats for rings which were
ever in use could be confusing.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agostrparser: Call skb_unclone conditionally
Vakul Garg [Fri, 29 Jun 2018 19:15:55 +0000 (00:45 +0530)]
strparser: Call skb_unclone conditionally

Calling skb_unclone() is expensive as it triggers a memcpy operation.
Instead of calling skb_unclone() unconditionally, call it only when skb
has a shared frag_list. This improves tls rx throughout significantly.

Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Suggested-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotc-testing: initial version of tunnel_key unit tests
Keara Leibovitz [Fri, 29 Jun 2018 14:47:31 +0000 (10:47 -0400)]
tc-testing: initial version of tunnel_key unit tests

Create unittests for the tc tunnel_key action.

v2:
For the tests expecting failures, added non-zero exit codes in the
teardowns. This prevents those tests from failing if the act_tunnel_key
module is unloaded.

Signed-off-by: Keara Leibovitz <kleib@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge tag 'mac80211-next-for-davem-2018-06-29' of git://git.kernel.org/pub/scm/linux...
David S. Miller [Sat, 30 Jun 2018 12:08:12 +0000 (21:08 +0900)]
Merge tag 'mac80211-next-for-davem-2018-06-29' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Small merge conflict in net/mac80211/scan.c, I preserved
the kcalloc() conversion. -DaveM

Johannes Berg says:

====================
This round's updates:
 * finally some of the promised HE code, but it turns
   out to be small - but everything kept changing, so
   one part I did in the driver was >30 patches for
   what was ultimately <200 lines of code ... similar
   here for this code.
 * improved scan privacy support - can now specify scan
   flags for randomizing the sequence number as well as
   reducing the probe request element content
 * rfkill cleanups
 * a timekeeping cleanup from Arnd
 * various other cleanups
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotipc: extend sock diag for group communication
GhantaKrishnamurthy MohanKrishna [Fri, 29 Jun 2018 11:26:18 +0000 (13:26 +0200)]
tipc: extend sock diag for group communication

This commit extends the existing TIPC socket diagnostics framework
for information related to TIPC group communication.

Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: GhantaKrishnamurthy MohanKrishna <mohan.krishna.ghanta.krishnamurthy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotipc: Auto removal of peer down node instance
GhantaKrishnamurthy MohanKrishna [Fri, 29 Jun 2018 11:23:41 +0000 (13:23 +0200)]
tipc: Auto removal of peer down node instance

A peer node is considered down if there are no
active links (or) lost contact to the node. In current implementation,
a peer node instance is deleted either if

a) TIPC module is removed (or)
b) Application can use a netlink/iproute2 interface to delete a
specific down node.

Thus, a down node instance lives in the system forever, unless the
application explicitly removes it.

We fix this by deleting the nodes which are down for
a specified amount of time (5 minutes).
Existing node supervision timer is used to achieve this.

Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: GhantaKrishnamurthy MohanKrishna <mohan.krishna.ghanta.krishnamurthy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agor8169: remove TBI 1000BaseX support
Heiner Kallweit [Fri, 29 Jun 2018 06:07:04 +0000 (08:07 +0200)]
r8169: remove TBI 1000BaseX support

The very first version of RTL8169 from 2002 (and only this one) has
support for a TBI 1000BaseX fiber interface. The TBI support in the
driver makes switching to phylib tricky, so best would be to get
rid of it. I found no report from anybody using a device with RTL8169
and fiber interface, also the vendor driver doesn't support this mode
(any longer).
So remove TBI support and bail out with a message if a card with
activated TBI is detected. If there really should be any user of it
out there, we could add a stripped-down version of the driver
supporting chip version 01 and TBI only (and maybe move it to
staging).

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotipc: optimize function tipc_node_timeout()
Tung Nguyen [Thu, 28 Jun 2018 20:39:25 +0000 (22:39 +0200)]
tipc: optimize function tipc_node_timeout()

In single-link usage, the function tipc_node_timeout() still iterates
over the whole link array to handle each link. Given that the maximum
number of bearers are 3, there are 2 redundant iterations with lock
grab/release. Since this function is executing very frequently it makes
sense to optimize it.

This commit adds conditional checking to exit from the loop if the
known number of configured links has already been accessed.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotipc: eliminate buffer cloning in function tipc_msg_extract()
Tung Nguyen [Thu, 28 Jun 2018 20:25:04 +0000 (22:25 +0200)]
tipc: eliminate buffer cloning in function tipc_msg_extract()

The function tipc_msg_extract() is using skb_clone() to clone inner
messages from a message bundle buffer. Although this method is safe,
it has an undesired effect that each buffer clone inherits the
true-size of the bundling buffer. As a result, the buffer clone
almost always ends up with being copied anyway by the message
validation function. This makes the cloning into a sub-optimization.

In this commit we take the consequence of this realization, and copy
each inner message to a separately allocated buffer up front in the
extraction function.

As a bonus we can now eliminate the two cases where we had to copy
re-routed packets that may potentially go out on the wire again.

Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: usb: Mark expected switch fall-throughs
Gustavo A. R. Silva [Thu, 28 Jun 2018 18:50:48 +0000 (13:50 -0500)]
net: usb: Mark expected switch fall-throughs

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: phy: realtek: add support for RTL8211
Heiner Kallweit [Thu, 28 Jun 2018 18:46:45 +0000 (20:46 +0200)]
net: phy: realtek: add support for RTL8211

In preparation of adding phylib support to the r8169 driver we need
PHY drivers for all chip-internal PHY types. Fortunately almost all
of them are either supported by the Realtek PHY driver already or work
with the genphy driver.
Still missing is support for the PHY of RTL8169s, it requires a quirk
to properly support 100Mbit-fixed mode. The quirk was copied from
r8169 driver which copied it from the vendor driver.
Based on the PHYID the internal PHY seems to be a RTL8211.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agor8169: use standard debug output functions
Heiner Kallweit [Thu, 28 Jun 2018 18:36:15 +0000 (20:36 +0200)]
r8169: use standard debug output functions

I see no need to define a private debug output symbol, let's use the
standard debug output functions instead. In this context also remove
the deprecated PFX define.

The one assertion is wrong IMO anyway, this code path is used also
by chip version 01.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'smc-pnetid-and-SMC-D-support'
David S. Miller [Sat, 30 Jun 2018 11:42:26 +0000 (20:42 +0900)]
Merge branch 'smc-pnetid-and-SMC-D-support'

Ursula Braun says:

====================
smc: pnetid and SMC-D support

SMC requires a configured pnet table to map Ethernet interfaces to
RoCE adapter ports. For s390 there exists hardware support to group
such devices. The first three patches cover the s390 pnetid support,
enabling SMC-R usage on s390 without configuring an extra pnet table.

SMC currently requires RoCE adapters, and uses RDMA-techniques
implemented with IB-verbs. But s390 offers another method for
intra-CEC Shared Memory communication. The following seven patches
implement a solution to run SMC traffic based on intra-CEC DMA,
called SMC-D.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agos390/ism: add device driver for internal shared memory
Sebastian Ott [Thu, 28 Jun 2018 17:05:13 +0000 (19:05 +0200)]
s390/ism: add device driver for internal shared memory

Add support for the Internal Shared Memory vPCI Adapter.
This driver implements the interfaces of the SMC-D protocol.

Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/smc: add SMC-D diag support
Hans Wippel [Thu, 28 Jun 2018 17:05:12 +0000 (19:05 +0200)]
net/smc: add SMC-D diag support

This patch adds diag support for SMC-D.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Suggested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/smc: add SMC-D support in af_smc
Hans Wippel [Thu, 28 Jun 2018 17:05:11 +0000 (19:05 +0200)]
net/smc: add SMC-D support in af_smc

This patch ties together the previous SMC-D patches. It adds support for
SMC-D to the listen and connect functions and, thus, enables SMC-D
support in the SMC code. If a connection supports both SMC-R and SMC-D,
SMC-D is preferred.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Suggested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/smc: add SMC-D support in data transfer
Hans Wippel [Thu, 28 Jun 2018 17:05:10 +0000 (19:05 +0200)]
net/smc: add SMC-D support in data transfer

The data transfer and CDC message headers differ in SMC-R and SMC-D.
This patch adds support for the SMC-D data transfer to the existing SMC
code. It consists of the following:

* SMC-D CDC support
* SMC-D tx support
* SMC-D rx support

The CDC header is stored at the beginning of the receive buffer. Thus, a
rx_offset variable is added for the CDC header offset within the buffer
(0 for SMC-R).

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Suggested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/smc: add SMC-D support in CLC messages
Hans Wippel [Thu, 28 Jun 2018 17:05:09 +0000 (19:05 +0200)]
net/smc: add SMC-D support in CLC messages

There are two types of SMC: SMC-R and SMC-D. These types are signaled
within the CLC messages during the CLC handshake. This patch adds
support for and checks of the SMC type.

Also, SMC-R and SMC-D need to exchange different information during the
CLC handshake. So, this patch extends the current message formats to
support the SMC-D header fields. The Proposal message can contain both
SMC-R and SMC-D information. The Accept and Confirm messages contain
either SMC-R or SMC-D information.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Suggested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/smc: add pnetid support for SMC-D and ISM
Hans Wippel [Thu, 28 Jun 2018 17:05:08 +0000 (19:05 +0200)]
net/smc: add pnetid support for SMC-D and ISM

SMC-D relies on PNETIDs to find usable SMC-D/ISM devices for a SMC
connection. This patch adds SMC-D/ISM support to the current PNETID
implementation.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Suggested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/smc: add base infrastructure for SMC-D and ISM
Hans Wippel [Thu, 28 Jun 2018 17:05:07 +0000 (19:05 +0200)]
net/smc: add base infrastructure for SMC-D and ISM

SMC supports two variants: SMC-R and SMC-D. For data transport, SMC-R
uses RDMA devices, SMC-D uses so-called Internal Shared Memory (ISM)
devices. An ISM device only allows shared memory communication between
SMC instances on the same machine. For example, this allows virtual
machines on the same host to communicate via SMC without RDMA devices.

This patch adds the base infrastructure for SMC-D and ISM devices to
the existing SMC code. It contains the following:

* ISM driver interface:
  This interface allows an ISM driver to register ISM devices in SMC. In
  the process, the driver provides a set of device ops for each device.
  SMC uses these ops to execute SMC specific operations on or transfer
  data over the device.

* Core SMC-D link group, connection, and buffer support:
  Link groups, SMC connections and SMC buffers (in smc_core) are
  extended to support SMC-D.

* SMC type checks:
  Some type checks are added to prevent using SMC-R specific code for
  SMC-D and vice versa.

To actually use SMC-D, additional changes to pnetid, CLC, CDC, etc. are
required. These are added in follow-up patches.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Suggested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/smc: optimize consumer cursor updates
Ursula Braun [Thu, 28 Jun 2018 17:05:06 +0000 (19:05 +0200)]
net/smc: optimize consumer cursor updates

The SMC protocol requires to send a separate consumer cursor update,
if it cannot be piggybacked to updates of the producer cursor.
Currently the decision to send a separate consumer cursor update
just considers the amount of data already received by the socket
program. It does not consider the amount of data already arrived, but
not yet consumed by the receiver. Basing the decision on the
difference between already confirmed and already arrived data
(instead of difference between already confirmed and already consumed
data), may lead to a somewhat earlier consumer cursor update send in
fast unidirectional traffic scenarios, and thus to better throughput.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Suggested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/smc: add pnetid support
Ursula Braun [Thu, 28 Jun 2018 17:05:05 +0000 (19:05 +0200)]
net/smc: add pnetid support

s390 hardware supports the definition of a so-call Physical NETwork
IDentifier (short PNETID) per network device port. These PNETIDS
can be used to identify network devices that are attached to the same
physical network (broadcast domain).

On s390 try to use the PNETID of the ethernet device port used for
initial connecting, and derive the IB device port used for SMC RDMA
traffic.

On platforms without PNETID support fall back to the existing
solution of a configured pnet table.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/smc: determine port attributes independent from pnet table
Ursula Braun [Thu, 28 Jun 2018 17:05:04 +0000 (19:05 +0200)]
net/smc: determine port attributes independent from pnet table

For SMC it is important to know the current port state of RoCE devices.
Monitoring port states has been triggered, when a RoCE device was added
to the pnet table. To support future alternatives to the pnet table the
monitoring of ports is made independent of the existence of a pnet table.
It starts once the smc_ib_device is established.

Due to this change smc_ib_remember_port_attr() is now a local function
and shuffling its location and the location of its used functions
makes any forward references obsolete.

And the duplicate SMC_MAX_PORTS definition is removed.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'Fixes-for-running-mirror-to-gretap-tests-on-veth'
David S. Miller [Sat, 30 Jun 2018 11:34:09 +0000 (20:34 +0900)]
Merge branch 'Fixes-for-running-mirror-to-gretap-tests-on-veth'

Petr Machata says:

====================
Fixes for running mirror-to-gretap tests on veth

The forwarding selftests infrastructure makes it possible to run the
individual tests on a purely software netdevices. Names of interfaces to
run the test with can be passed as command line arguments to a test.
lib.sh then creates veth pairs backing the interfaces if none exist in
the system.

However, the tests need to recognize that they might be run on a soft
device. Many mirror-to-gretap tests are buggy in this regard. This patch
set aims to fix the problems in running mirror-to-gretap tests on veth
devices.

In patch #1, a service function is split out of setup_wait().
In patch #2, installing a trap is made optional.
In patch #3, tc filters in several tests are tweaked to work with veth.
In patch #4, the logic for waiting for neighbor is fixed for veth.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests: forwarding: mirror_gre_changes: Fix waiting for neighbor
Petr Machata [Thu, 28 Jun 2018 16:56:39 +0000 (18:56 +0200)]
selftests: forwarding: mirror_gre_changes: Fix waiting for neighbor

When running the test on soft devices, there's no mechanism to
gratuitously start resolving the neighbor for remote tunnel endpoint.
So instead of passively waiting, wait for the device to be up, and then
probe the neighbor with a ping.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests: forwarding: Tweak tc filters for mirror-to-gretap tests
Petr Machata [Thu, 28 Jun 2018 16:56:33 +0000 (18:56 +0200)]
selftests: forwarding: Tweak tc filters for mirror-to-gretap tests

When running mirror_gre_bridge_1d_vlan tests on veth, several issues
cause spurious failures:

- vlan_ethtype should be ip, not ipv6 even in mirror-to-ip6gretap case,
  because the overlay packet is still IPv4.
- Similarly ip_proto matches the innermost IP protocol, so can't be used
  to filter out GRE packet. Drop the corresponding condition.
- Because the above fixes the filters to match in slow path as well,
  they need to be made skip_hw so as not to double-count packets.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests: forwarding: lib: Avoid trapping soft devices
Petr Machata [Thu, 28 Jun 2018 16:56:28 +0000 (18:56 +0200)]
selftests: forwarding: lib: Avoid trapping soft devices

There are several cases where traffic that would normally be forwarded
in silicon needs to be observed in slow path. That's achieved by
trapping such traffic, and the functions trap_install() and
trap_uninstall() realize that. However, such treatment is obviously
wrong if the device in question is actually a soft device not backed by
an ASIC.

Therefore try to trap if possible, but fall back to inserting a continue
if not.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests: forwarding: lib: Split out setup_wait_dev()
Petr Machata [Thu, 28 Jun 2018 16:56:20 +0000 (18:56 +0200)]
selftests: forwarding: lib: Split out setup_wait_dev()

Split out of setup_wait() a function setup_wait_dev() that waits for a
single device. This gives tests the opportunity to wait for a selected
device after they tinkered with its upness.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'xilinx_emaclite-coding-style'
David S. Miller [Sat, 30 Jun 2018 11:15:45 +0000 (20:15 +0900)]
Merge branch 'xilinx_emaclite-coding-style'

Radhey Shyam Pandey says:

====================
Fixes coding style in xilinx_emaclite.c

This patchset fixes checkpatch and kernel-doc warnings in
xilinx emaclite driver. No functional change.

Changes from v2:
-In 2/5 patch refactor if-else to make failure path return early.
-In 2/5 patch coalesce the format onto a single line and add the
missing space after the comma.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: emaclite: Remove unnecessary spaces
Radhey Shyam Pandey [Thu, 28 Jun 2018 13:11:50 +0000 (18:41 +0530)]
net: emaclite: Remove unnecessary spaces

This patch fixes below checkpatch checks-

CHECK: spaces preferred around that '*' (ctx:VxV)
CHECK: No space is necessary after a cast

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: emaclite: Fix block comments style
Radhey Shyam Pandey [Thu, 28 Jun 2018 13:11:49 +0000 (18:41 +0530)]
net: emaclite: Fix block comments style

This patch fixes below checkpatch warnings-

WARNING: Block comments use a trailing */ on a separate line
WARNING: Block comments use * on subsequent lines
WARNING: networking block comments don't use an empty /* line,
use /* Comment

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: emaclite: update kernel-doc comments
Radhey Shyam Pandey [Thu, 28 Jun 2018 13:11:48 +0000 (18:41 +0530)]
net: emaclite: update kernel-doc comments

This patch fixes below kernel-doc warnings:

Function parameter or member 'maxlen' not described in 'xemaclite_recv_data'
Function parameter or member 'address'not described in 'xemaclite_set_mac_address'
Excess function parameter 'addr' description in 'xemaclite_set_mac_address'
No description found for return value of 'xemaclite_interrupt'
No description found for return value of 'xemaclite_mdio_write'
Function parameter or member 'dev' not described in 'xemaclite_mdio_setup'
Excess function parameter 'ofdev' description in 'xemaclite_mdio_setup'
No description found for return value of 'xemaclite_open'
No description found for return value of 'xemaclite_close'
Excess function parameter 'match' description in 'xemaclite_of_probe'

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: emaclite: Simplify if-else statements
Radhey Shyam Pandey [Thu, 28 Jun 2018 13:11:47 +0000 (18:41 +0530)]
net: emaclite: Simplify if-else statements

Remove else as it is not required with if doing a return.
It also coalesce the format onto a single line and add the
missing space after the comma. Fixes below checkpatch warning-

WARNING: else is not generally useful after a break or return

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: emaclite: Use __func__ instead of hardcoded name
Radhey Shyam Pandey [Thu, 28 Jun 2018 13:11:46 +0000 (18:41 +0530)]
net: emaclite: Use __func__ instead of hardcoded name

Switch hardcoded function name with a reference to __func__ making
the code more maintainable. Address below checkpatch warning:

WARNING: Prefer using '"%s...", __func__' to using 'xemaclite_mdio_read',
this function's name, in a string
+               "xemaclite_mdio_read(phy_id=%i, reg=%x) == %x\n",

WARNING: Prefer using '"%s...", __func__' to using 'xemaclite_mdio_write',
this function's name, in a string
+               "xemaclite_mdio_write(phy_id=%i, reg=%x, val=%x)\n",

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'mvpp2-Add-big-endian-support'
David S. Miller [Sat, 30 Jun 2018 09:54:09 +0000 (18:54 +0900)]
Merge branch 'mvpp2-Add-big-endian-support'

Maxime Chevallier says:

====================
net: mvpp2: Add big-endian support

This series allows to use PPv2 on system built as big endian.

The first patch fixes the way we represent TX and RX descriptors, so that
they used fixed little endianness as expected by the PPv2 controller.

The second reworks the way we handle the software representation of the
Header Parser entries, so that we don't use a union of arrays.

The last two patches fixes some incorrect byte swapping logic, that wen't
un-noticed on little-endian.

This whole series doesn't fix any existing bug for little-endian systems, and
since big-endian never worked for this driver, I didn't include 'fixes' tags.

This was tested on MacchiatoBin (Armada 8040).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: mvpp2: Use htons when checking protocol info
Maxime Chevallier [Thu, 28 Jun 2018 12:42:07 +0000 (14:42 +0200)]
net: mvpp2: Use htons when checking protocol info

When checking the skb->protocol field, we have to make sure we use the
proper endianness using htons, and not swab16.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: mvpp2: prs: Drop unnecessary swab16 in vlan detection
Maxime Chevallier [Thu, 28 Jun 2018 12:42:06 +0000 (14:42 +0200)]
net: mvpp2: prs: Drop unnecessary swab16 in vlan detection

Vlan IDs must not be swapped when creating Header Parser entries. This
has no effect on little-endian systems, but is wrong for big-endian.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: mvpp2: prs: Drop unions representing TCAM and SRAM entries
Maxime Chevallier [Thu, 28 Jun 2018 12:42:05 +0000 (14:42 +0200)]
net: mvpp2: prs: Drop unions representing TCAM and SRAM entries

PPv2's Header Parser use some large TCAM and SRAM entries, that are
duplicated in software so that we can write them to hardware only when
we are done modifying them.

Currently, PPv2 uses a union containing arrays of u32 and u8 to represent
these entries, to facilitate byte per byte access. This representation is
broken when we want to support big endian, and this makes the code
confusing to read.

This patch drops the union, and simply stores the TCAM and SRAM entries
as u32 arrays, each entry corresponding to a 32-bit register.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: mvpp2: Make TX / RX descriptors little-endian
Maxime Chevallier [Thu, 28 Jun 2018 12:42:04 +0000 (14:42 +0200)]
net: mvpp2: Make TX / RX descriptors little-endian

The PPv2 controller always expect descriptors to be in little endian. We
must therefore force descriptors to use that format, and convert to the
host endianness when necessary.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotcp: add new SNMP counter for drops when try to queue in rcv queue
Yafang Shao [Thu, 28 Jun 2018 04:22:56 +0000 (00:22 -0400)]
tcp: add new SNMP counter for drops when try to queue in rcv queue

When sk_rmem_alloc is larger than the receive buffer and we can't
schedule more memory for it, the skb will be dropped.

In above situation, if this skb is put into the ofo queue,
LINUX_MIB_TCPOFODROP is incremented to track it.

While if this skb is put into the receive queue, there's no record.
So a new SNMP counter is introduced to track this behavior.

LINUX_MIB_TCPRCVQDROP:  Number of packets meant to be queued in rcv queue
but dropped because socket rcvbuf limit hit.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agobnx2x: Mark expected switch fall-throughs
Gustavo A. R. Silva [Thu, 28 Jun 2018 01:32:23 +0000 (20:32 -0500)]
bnx2x: Mark expected switch fall-throughs

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: stmmac: Add support for CBS QDISC
Jose Abreu [Wed, 27 Jun 2018 14:57:02 +0000 (15:57 +0100)]
net: stmmac: Add support for CBS QDISC

This adds support for CBS reconfiguration using the TC application.

A new callback was added to TC ops struct and another one to DMA ops to
reconfigure the channel mode.

Tested in GMAC5.10.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge tag 'mlx5e-updates-2018-06-28' of git://git.kernel.org/pub/scm/linux/kernel...
David S. Miller [Fri, 29 Jun 2018 14:54:31 +0000 (23:54 +0900)]
Merge tag 'mlx5e-updates-2018-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5e-updates-2018-06-28

mlx5e netdevice driver updates:

- Boris Pismenny added the support for UDP GSO in the first two patches.
  Impressive performance numbers are included in the commit message,
  @Line rate with ~half of the cpu utilization compared to non offload
  or no GSO at all.

- From Tariq Toukan:
  - Convert large order kzalloc allocations to kvzalloc.
  - Added performance diagnostic statistics to several places in data path.

From Saeed and Eran,
  - Update NIC HW stats on demand only, this is to eliminate the background
    thread needed to update some HW statistics in the driver cache in
    order to report error and drop counters from HW in ndo_get_stats.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'net-Geneve-options-support-for-TC-act_tunnel_key'
David S. Miller [Fri, 29 Jun 2018 14:50:27 +0000 (23:50 +0900)]
Merge branch 'net-Geneve-options-support-for-TC-act_tunnel_key'

Jakub Kicinski says:

====================
net: Geneve options support for TC act_tunnel_key

Simon & Pieter say:

This set adds Geneve Options support to the TC tunnel key action.
It provides the plumbing required to configure Geneve variable length
options.  The options can be configured in the form CLASS:TYPE:DATA,
where CLASS is represented as a 16bit hexadecimal value, TYPE as an 8bit
hexadecimal value and DATA as a variable length hexadecimal value.
Additionally multiple options may be listed using a comma delimiter.

v2:
 - fix sparse warnings in patches 3 and 4 (first one reported by
   build bot).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/sched: add tunnel option support to act_tunnel_key
Simon Horman [Wed, 27 Jun 2018 04:39:37 +0000 (21:39 -0700)]
net/sched: add tunnel option support to act_tunnel_key

Allow setting tunnel options using the act_tunnel_key action.

Options are expressed as class:type:data and multiple options
may be listed using a comma delimiter.

 # ip link add name geneve0 type geneve dstport 0 external
 # tc qdisc add dev eth0 ingress
 # tc filter add dev eth0 protocol ip parent ffff: \
     flower indev eth0 \
        ip_proto udp \
        action tunnel_key \
            set src_ip 10.0.99.192 \
            dst_ip 10.0.99.193 \
            dst_port 6081 \
            id 11 \
            geneve_opts 0102:80:00800022,0102:80:00800022 \
    action mirred egress redirect dev geneve0

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: check tunnel option type in tunnel flags
Pieter Jansen van Vuuren [Wed, 27 Jun 2018 04:39:36 +0000 (21:39 -0700)]
net: check tunnel option type in tunnel flags

Check the tunnel option type stored in tunnel flags when creating options
for tunnels. Thereby ensuring we do not set geneve, vxlan or erspan tunnel
options on interfaces that are not associated with them.

Make sure all users of the infrastructure set correct flags, for the BPF
helper we have to set all bits to keep backward compatibility.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/sched: act_tunnel_key: add extended ack support
Simon Horman [Wed, 27 Jun 2018 04:39:35 +0000 (21:39 -0700)]
net/sched: act_tunnel_key: add extended ack support

Add extended ack support for the tunnel key action by using NL_SET_ERR_MSG
during validation of user input.

Cc: Alexander Aring <aring@mojatatu.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/sched: act_tunnel_key: disambiguate metadata dst error cases
Simon Horman [Wed, 27 Jun 2018 04:39:34 +0000 (21:39 -0700)]
net/sched: act_tunnel_key: disambiguate metadata dst error cases

Metadata may be NULL for one of two reasons:
* Missing user input
* Failure to allocate the metadata dst

Disambiguate these case by returning -EINVAL for the former and -ENOMEM
for the latter rather than -EINVAL for both cases.

This is in preparation for using extended ack to provide more information
to users when parsing their input.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agocxgb4: Support ethtool private flags
Arjun Vynipadath [Tue, 26 Jun 2018 11:40:50 +0000 (17:10 +0530)]
cxgb4: Support ethtool private flags

This is used to change TX workrequests, which helps in
host->vf communication.

Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agocxgb4: Add support for FW_ETH_TX_PKT_VM_WR
Arjun Vynipadath [Tue, 26 Jun 2018 11:40:25 +0000 (17:10 +0530)]
cxgb4: Add support for FW_ETH_TX_PKT_VM_WR

The present TX workrequest(FW_ETH_TX_PKT_WR) cant be used for
host->vf communication, since it doesn't loopback the outgoing
packets to virtual interfaces on the same port. This can be done
using FW_ETH_TX_PKT_VM_WR.
This fix depends on ethtool_flags to determine what WR to use for
TX path. Support for setting this flags by user is added in next
commit.

Based on the original work by : Casey Leedom <leedom@chelsio.com>

Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agosctp: add support for SCTP_REUSE_PORT sockopt
Xin Long [Thu, 28 Jun 2018 07:31:00 +0000 (15:31 +0800)]
sctp: add support for SCTP_REUSE_PORT sockopt

This feature is actually already supported by sk->sk_reuse which can be
set by socket level opt SO_REUSEADDR. But it's not working exactly as
RFC6458 demands in section 8.1.27, like:

  - This option only supports one-to-one style SCTP sockets
  - This socket option must not be used after calling bind()
    or sctp_bindx().

Besides, SCTP_REUSE_PORT sockopt should be provided for user's programs.
Otherwise, the programs with SCTP_REUSE_PORT from other systems will not
work in linux.

To separate it from the socket level version, this patch adds 'reuse' in
sctp_sock and it works pretty much as sk->sk_reuse, but with some extra
setup limitations that are needed when it is being enabled.

"It should be noted that the behavior of the socket-level socket option
to reuse ports and/or addresses for SCTP sockets is unspecified", so it
leaves SO_REUSEADDR as is for the compatibility.

Note that the name SCTP_REUSE_PORT is somewhat confusing, as its
functionality is nearly identical to SO_REUSEADDR, but with some
extra restrictions. Here it uses 'reuse' in sctp_sock instead of
'reuseport'. As for sk->sk_reuseport support for SCTP, it will be
added in another patch.

Thanks to Neil to make this clear.

v1->v2:
  - add sctp_sk->reuse to separate it from the socket level version.
v2->v3:
  - improve changelog according to Marcelo's suggestion.

Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: ethernet: stmmac: dwmac-rk: Add GMAC support for px30
David Wu [Thu, 28 Jun 2018 01:33:21 +0000 (09:33 +0800)]
net: ethernet: stmmac: dwmac-rk: Add GMAC support for px30

Add constants and callback functions for the dwmac on px30 Soc.
The base structure is the same, but registers and the bits in
them are moved slightly, and add the clk_mac_speed for selecting
mac speed.

Signed-off-by: David Wu <david.wu@rock-chips.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotg3: Mark expected switch fall-throughs
Gustavo A. R. Silva [Thu, 28 Jun 2018 01:45:24 +0000 (20:45 -0500)]
tg3: Mark expected switch fall-throughs

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agomac80211: use BIT_ULL for NL80211_STA_INFO_* attribute types
Omer Efrat [Sun, 17 Jun 2018 10:06:25 +0000 (13:06 +0300)]
mac80211: use BIT_ULL for NL80211_STA_INFO_* attribute types

The BIT macro uses unsigned long which some architectures handle as 32 bit
and therefore might cause macro's shift to overflow when used on a value
equals or larger than 32 (NL80211_STA_INFO_RX_DURATION and afterwards).

Since 'filled' member in station_info changed to u64, BIT_ULL macro
should be used with all NL80211_STA_INFO_* attribute types instead of BIT
to prevent future possible bugs when one will use BIT macro for higher
attributes by mistake.

This commit cleans up all usages of BIT macro with the above field
in mac80211 by changing it to BIT_ULL instead.

Signed-off-by: Omer Efrat <omer.efrat@tandemg.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 years agocfg80211: use BIT_ULL for NL80211_STA_INFO_* attribute types
Omer Efrat [Sun, 17 Jun 2018 10:06:14 +0000 (13:06 +0300)]
cfg80211: use BIT_ULL for NL80211_STA_INFO_* attribute types

The BIT macro uses unsigned long which some architectures handle as 32 bit
and therefore might cause macro's shift to overflow when used on a value
equals or larger than 32 (NL80211_STA_INFO_RX_DURATION and afterwards).

Since 'filled' member in station_info changed to u64, BIT_ULL macro
should be used with all NL80211_STA_INFO_* attribute types instead of BIT
to prevent future possible bugs when one will use BIT macro for higher
attributes by mistake.

This commit cleans up all usages of BIT macro with the above field
in cfg80211 by changing it to BIT_ULL instead. In addition, there are
some places which don't use BIT nor BIT_ULL macros so align those as well.

Signed-off-by: Omer Efrat <omer.efrat@tandemg.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 years agomac80211: remove unnecessary NULL check
Johannes Berg [Fri, 29 Jun 2018 07:51:39 +0000 (09:51 +0200)]
mac80211: remove unnecessary NULL check

We don't need to check if he_oper is NULL before calling
ieee80211_verify_sta_he_mcs_support() as it - now - will
correctly check this itself. Remove the redundant check.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 years agomac80211: fix potential null pointer dereference
Gustavo A. R. Silva [Mon, 18 Jun 2018 12:41:34 +0000 (07:41 -0500)]
mac80211: fix potential null pointer dereference

he_op is being dereferenced before it is null checked, hence there
is a potential null pointer dereference.

Fix this by moving the pointer dereference after he_op has been
properly null checked.

Notice that, currently, he_op is already being null checked before
calling this function at 4593:

4593 if (!he_oper ||
4594     !ieee80211_verify_sta_he_mcs_support(sband, he_oper))
4595 ifmgd->flags |= IEEE80211_STA_DISABLE_HE;

but in case ieee80211_verify_sta_he_mcs_support is ever called
without verifying he_oper is not null, we will end up having a
null pointer dereference. So, we better don't take any chances.

Addresses-Coverity-ID: 1470068 ("Dereference before null check")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 years agocfg80211: track time using boottime
Arnd Bergmann [Mon, 18 Jun 2018 15:11:14 +0000 (17:11 +0200)]
cfg80211: track time using boottime

The cfg80211 layer uses get_seconds() to read the current time
in its supend handling. This function is deprecated because of the 32-bit
time_t overflow, and it can cause unexpected behavior when the time
changes due to settimeofday() calls or leap second updates.

In many cases, we want to use monotonic time instead, however cfg80211
explicitly tracks the time spent in suspend, so this changes the
driver over to use ktime_get_boottime_seconds(), which is slightly
slower, but not used in a fastpath here.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 years agoMerge branch 'ila-Cleanup'
David S. Miller [Fri, 29 Jun 2018 02:32:55 +0000 (11:32 +0900)]
Merge branch 'ila-Cleanup'

Tom Herbert says:

====================
ila: Cleanup

Perform some cleanup in ILA code. This includes:

- Fix rhashtable walk for cases where nl dumps are done with muliple
  function calls. Add a skip index to skip over entries in
  a node that have been previously visitied. Call rhashtable_walk_peek
  to avoid dropping items between calls to ila_nl_dump.
- Call alloc_bucket_spinlocks to create bucket locks.
- Split out module initialization and netlink definitions into
  separate files.
- Add ILA_CMD_FLUSH netlink command to clear the ILA translation table.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoila: Flush netlink command to clear xlat table
Tom Herbert [Wed, 27 Jun 2018 21:39:02 +0000 (14:39 -0700)]
ila: Flush netlink command to clear xlat table

Add ILA_CMD_FLUSH netlink command to clear the ILA translation table.

Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoila: Create main ila source file
Tom Herbert [Wed, 27 Jun 2018 21:39:01 +0000 (14:39 -0700)]
ila: Create main ila source file

Create a main ila file that contains the module initialization functions
as well as netlink definitions. Previously these were defined in
ila_xlat and ila_common. This approach allows better extensibility.

Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoila: Call library function alloc_bucket_locks
Tom Herbert [Wed, 27 Jun 2018 21:39:00 +0000 (14:39 -0700)]
ila: Call library function alloc_bucket_locks

To allocate the array of bucket locks for the hash table we now
call library function alloc_bucket_spinlocks.

Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoila: Fix use of rhashtable walk in ila_xlat.c
Tom Herbert [Wed, 27 Jun 2018 21:38:59 +0000 (14:38 -0700)]
ila: Fix use of rhashtable walk in ila_xlat.c

Perform better EAGAIN handling, handle case where ila_dump_info
fails and we missed objects in the dump, and add a skip index
to skip over ila entires in a list on a rhashtable node that have
already been visited (by a previous call to ila_nl_dump).

Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'hns3-a-few-code-improvements'
David S. Miller [Fri, 29 Jun 2018 02:06:35 +0000 (11:06 +0900)]
Merge branch 'hns3-a-few-code-improvements'

Peng Li says:

====================
net: hns3: a few code improvements

This patchset fixes a few code stylistic issues from
concentrated review, no functional changes introduced.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: hns3: use lower_32_bits and upper_32_bits
Huazhong Tan [Thu, 28 Jun 2018 04:12:29 +0000 (12:12 +0800)]
net: hns3: use lower_32_bits and upper_32_bits

MACRO lower_32_bits and upper_32_bits can help to get bits 0-31
and bits 32-63 of a number, so just use it.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: hns3: remove back in struct hclge_hw
Huazhong Tan [Thu, 28 Jun 2018 04:12:28 +0000 (12:12 +0800)]
net: hns3: remove back in struct hclge_hw

hclge_hw is embedded in hclge_dev, so use container_of instead of
back to get hclge_dev.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: hns3: remove the Redundant put_vector in hns3_client_uninit
Peng Li [Thu, 28 Jun 2018 04:12:27 +0000 (12:12 +0800)]
net: hns3: remove the Redundant put_vector in hns3_client_uninit

The interface h->ae_algo->ops->put_vector is called in both
hns3_nic_dealloc_vector_data and hns3_nic_uninit_vector_data in
hns3_client_uninit, this will cause vector freed twice.
This patch remove the Redundant put_vector to make vector freed
only once.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: hns3: print the ret value in error information
Peng Li [Thu, 28 Jun 2018 04:12:26 +0000 (12:12 +0800)]
net: hns3: print the ret value in error information

Print the ret value in error information can help find the reason.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: hns3: extraction an interface for state init|uninit
Peng Li [Thu, 28 Jun 2018 04:12:25 +0000 (12:12 +0800)]
net: hns3: extraction an interface for state init|uninit

Extraction an interface for state init|uninit to make the code
easier to read.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: hns3: remove unused head file in hnae3.c
Peng Li [Thu, 28 Jun 2018 04:12:24 +0000 (12:12 +0800)]
net: hns3: remove unused head file in hnae3.c

linux/slab.h is not used in hnae3.h, this patch removes it.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: hns3: add unlikely for error check
Peng Li [Thu, 28 Jun 2018 04:12:23 +0000 (12:12 +0800)]
net: hns3: add unlikely for error check

The first bd of a packet is invalid and invalid ring head for tx
IRQ is not offen, they may occur when there is error,
Add unlikely for error check branch is better for performance.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: hns3: add l4_type check for both ipv4 and ipv6
Peng Li [Thu, 28 Jun 2018 04:12:22 +0000 (12:12 +0800)]
net: hns3: add l4_type check for both ipv4 and ipv6

HW supports UDP, TCP and SCTP packets checksum for both ipv4 and
ipv6,  but do not support other type packets checksum for ipv4 or
ipv6.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: hns3: add vector status check before free vector
Peng Li [Thu, 28 Jun 2018 04:12:21 +0000 (12:12 +0800)]
net: hns3: add vector status check before free vector

If the hdev->vector_status[vector_id] is already HCLGE_INVALID_VPORT,
should log the error and return.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: hns3: rename the interface for init_client_instance and uninit_client_instance
Peng Li [Thu, 28 Jun 2018 04:12:20 +0000 (12:12 +0800)]
net: hns3: rename the interface for init_client_instance and uninit_client_instance

The interface init_client_instance and uninit_client_instance
do not register anything, only initialize the client instance.
This patch rename the related interface to make the function
name to indicate the purpose.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: hns3: remove hclge_get_vector_index from hclge_bind_ring_with_vector
Peng Li [Thu, 28 Jun 2018 04:12:19 +0000 (12:12 +0800)]
net: hns3: remove hclge_get_vector_index from hclge_bind_ring_with_vector

In hclge_unmap_ring_frm_vector, there are 2 steps:
step 1: get vector index.
step 2 unbind ring with vector.

But it gets vector id again in step 2 interface. This patch
removes hclge_get_vector_index from hclge_bind_ring_with_vector,
and make the step the same with hns3 PF driver.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>