David S. Miller [Tue, 4 Jun 2019 21:49:38 +0000 (14:49 -0700)]
Merge branch 'bond-mpls'
Ariel Levkovich says:
====================
Support MPLS features in bonding and vlan net devices
Netdevice HW MPLS features are not passed from device driver's netdevice to
upper netdevice, specifically VLAN and bonding netdevice which are created
by the kernel when needed.
This prevents enablement and usage of HW offloads, such as TSO and checksumming
for MPLS tagged traffic when running via VLAN or bonding interface.
The patches introduce changes to the initialization steps of the VLAN and bonding
netdevices to inherit the MPLS features from lower netdevices to allow the HW
offloads.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Levkovich [Mon, 3 Jun 2019 22:36:47 +0000 (22:36 +0000)]
net: vlan: Inherit MPLS features from parent device
During the creation of the VLAN interface net device,
the various device features and offloads are being set based
on the parent device's features.
The code initiates the basic, vlan and encapsulation features
but doesn't address the MPLS features set and they remain blank.
As a result, all device offloads that have significant performance
effect are disabled for MPLS traffic going via this VLAN device such
as checksumming and TSO.
This patch makes sure that MPLS features are also set for the
VLAN device based on the parent which will allow HW offloads of
checksumming and TSO to be performed on MPLS tagged packets.
Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Levkovich [Mon, 3 Jun 2019 22:36:46 +0000 (22:36 +0000)]
net: bonding: Inherit MPLS features from slave devices
When setting the bonding interface net device features,
the kernel code doesn't address the slaves' MPLS features
and doesn't inherit them.
Therefore, HW offloads that enhance performance such as
checksumming and TSO are disabled for MPLS tagged traffic
flowing via the bonding interface.
The patch adds the inheritance of the MPLS features from the
slave devices, with logic similar to that used for setting the
bonding device's VLAN and encapsulation features.
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
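As a rough illustration of the inheritance described above, an upper device can only advertise an offload that every lower device supports. The sketch below models that with plain bitmasks. The flag names and the bond_compute_mpls_features() helper are hypothetical and much simpler than the kernel's real netdev feature handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits, standing in for real netdev feature flags. */
#define FEAT_HW_CSUM (1u << 0)
#define FEAT_TSO     (1u << 1)
#define FEAT_SG      (1u << 2)

/* Offloads the upper device is willing to inherit for MPLS traffic. */
#define MPLS_CANDIDATES (FEAT_HW_CSUM | FEAT_TSO | FEAT_SG)

/*
 * Start from every candidate offload and keep only the bits that all
 * slaves advertise. With no slaves the candidate mask is returned as-is.
 */
static uint32_t bond_compute_mpls_features(const uint32_t *slave_feats,
					   size_t n)
{
	uint32_t feats = MPLS_CANDIDATES;
	size_t i;

	assert(slave_feats != NULL || n == 0);
	for (i = 0; i < n; i++)
		feats &= slave_feats[i];
	return feats;
}
```

In this model, a single slave without TSO is enough to clear TSO on the upper device.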
David S. Miller [Tue, 4 Jun 2019 21:33:50 +0000 (14:33 -0700)]
Merge branch 'net-tls-small-general-improvements'
Jakub Kicinski says:
====================
net/tls: small general improvements
This series cleans up and improves the tls code, mostly the offload
parts.
First, a slight performance optimization: patch 1 avoids unnecessary
re-encryption of records. Next, patch 2 makes the code more resilient
by checking for errors in skb_copy_bits(). The next commit removes a
warning which can be triggered in normal operation (especially for
devices explicitly making use of the fallback path). The next two
patches change the condition checking around the call to
tls_device_decrypted() to make it easier to extend. The remaining
commits are centered around reorganizing struct tls_context for
better cache utilization.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 3 Jun 2019 22:17:05 +0000 (15:17 -0700)]
net/tls: don't pass version to tls_advance_record_sn()
All callers pass prot->version as the last parameter
of tls_advance_record_sn(), yet tls_advance_record_sn()
itself needs a pointer to prot. Pass prot from callers.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 3 Jun 2019 22:17:04 +0000 (15:17 -0700)]
net/tls: reorganize struct tls_context
struct tls_context is slightly badly laid out. If we reorder things
right we can save 16 bytes (320 -> 304) but also make all fast path
data fit into two cache lines (one read only and one read/write,
down from four cache lines).
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
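The effect of field ordering on structure size can be shown with a small, self-contained sketch. The structs below are hypothetical and unrelated to the real struct tls_context. They only show how interleaving small and large fields adds padding that reordering removes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: 1-byte fields interleaved with 8-byte fields. */
struct ctx_padded {
	uint8_t  tx_conf;
	uint64_t tx_seq;
	uint8_t  rx_conf;
	uint64_t rx_seq;
};

/* Same fields, largest first, so the small ones share one padded slot. */
struct ctx_reordered {
	uint64_t tx_seq;
	uint64_t rx_seq;
	uint8_t  tx_conf;
	uint8_t  rx_conf;
};

/* Number of bytes saved by the reordering on the current ABI. */
static size_t bytes_saved(void)
{
	assert(sizeof(struct ctx_padded) >= sizeof(struct ctx_reordered));
	return sizeof(struct ctx_padded) - sizeof(struct ctx_reordered);
}
```

On x86-64 the first layout is expected to occupy 32 bytes and the second 24. Other ABIs may differ, which is why the helper compares sizes instead of hard-coding them.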
Jakub Kicinski [Mon, 3 Jun 2019 22:17:03 +0000 (15:17 -0700)]
net/tls: use version from prot
ctx->prot holds the same information as per-direction contexts.
Almost all code gets the TLS version from this structure; convert
the last two stragglers. This way we can improve the cache
utilization by moving the per-direction data into cold cache lines.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 3 Jun 2019 22:17:02 +0000 (15:17 -0700)]
net/tls: don't re-check msg decrypted status in tls_device_decrypted()
tls_device_decrypted() is only called from decrypt_skb_update()
when ctx->decrypted == false, so there is no need to re-check the bit.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 3 Jun 2019 22:17:01 +0000 (15:17 -0700)]
net/tls: don't look for decrypted frames on non-offloaded sockets
If the RX config of a TLS socket is SW, there is no point iterating
over the fragments and checking if frame is decrypted. It will
always be fully encrypted. Note that in fully encrypted case
the function doesn't actually touch any offload-related state,
so it's safe to call for TLS_SW, today. Soon we will introduce
code which can only be called for offloaded contexts.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 3 Jun 2019 22:17:00 +0000 (15:17 -0700)]
net/tls: remove false positive warning
It's possible that TCP stack will decide to retransmit a packet
right when that packet's data gets acked, especially in presence
of packet reordering. This means that packets may be in flight,
even though tls_device code has already freed their record state.
Make fill_sg_in() and in turn tls_sw_fallback() not generate a
warning in that case, and quietly proceed to drop such frames.
Make the exit path from tls_sw_fallback() drop monitor friendly,
for users to be able to troubleshoot dropped retransmissions.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 3 Jun 2019 22:16:59 +0000 (15:16 -0700)]
net/tls: check return values from skb_copy_bits() and skb_store_bits()
In light of recent bugs, we should make a better effort of
checking return values. In theory none of the functions should
fail today.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 3 Jun 2019 22:16:58 +0000 (15:16 -0700)]
net/tls: fully initialize the msg wrapper skb
If strparser gets cornered into starting a new message from
an sk_buff which already has frags, it will allocate a new
skb to become the "wrapper" around the fragments of the
message.
This new skb does not inherit any metadata fields. In case
of TLS offload this may lead to unnecessarily re-encrypting
the message, as skb->decrypted is not set for the wrapper skb.
Try to be conservative and copy all fields of the old skb that
strparser's user may reasonably need.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
net: mscc: ocelot: Fix some struct initializations
Clang warns:
drivers/net/ethernet/mscc/ocelot_ace.c:335:37: warning: suggest braces
around initialization of subobject [-Wmissing-braces]
struct ocelot_vcap_u64 payload = { 0 };
^
{}
drivers/net/ethernet/mscc/ocelot_ace.c:336:28: warning: suggest braces
around initialization of subobject [-Wmissing-braces]
struct vcap_data data = { 0 };
^
{}
drivers/net/ethernet/mscc/ocelot_ace.c:683:37: warning: suggest braces
around initialization of subobject [-Wmissing-braces]
struct ocelot_ace_rule del_ace = { 0 };
^
{}
drivers/net/ethernet/mscc/ocelot_ace.c:743:28: warning: suggest braces
around initialization of subobject [-Wmissing-braces]
struct vcap_data data = { 0 };
^
{}
4 warnings generated.
One way to fix these warnings is to add additional braces like Clang
suggests; however, there has been a bit of push back from some
maintainers[1][2], who just prefer memset as it is unambiguous, doesn't
depend on a particular compiler version[3], and properly initializes all
subobjects. Do that here so there are no more warnings.
Fixes: b596229448dd ("net: mscc: ocelot: Add support for tcam")
Link: https://github.com/ClangBuiltLinux/linux/issues/505
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
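The memset approach preferred above can be sketched as follows. The structs are hypothetical stand-ins for a type whose first member is itself an aggregate, which is the situation where `= { 0 }` draws the -Wmissing-braces warning from Clang.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical nested structs: the first member is itself an aggregate. */
struct inner {
	int key[2];
	int mask[2];
};

struct outer {
	struct inner match;
	int action;
};

/*
 * Zero the whole object, including every subobject, without relying on
 * brace-initializer rules. A declaration such as
 * "struct outer o = { 0 };" is the form that draws the warning.
 */
static void outer_init(struct outer *o)
{
	assert(o != NULL);
	memset(o, 0, sizeof(*o));
}
```

A caller would declare `struct outer o;` and then call `outer_init(&o);` before first use.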
This occurs because we hold RTNL mutex, but no rcu read lock.
The second call site holds both, so just switch to the _rtnl variant.
Reported-by: syzbot+bad6e32808a3a97b1515@syzkaller.appspotmail.com
Fixes: 2638eb8b50cf ("net: ipv4: provide __rcu annotation for ifa_list")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
====================
expose flash update status to user
When a user is flashing a device using devlink, they currently do not see
any information about what is going on, percentages, etc.
Drivers, for example mlxsw and mlx5, have a notion of the progress
and of what is happening. This patchset exposes this progress
information to userspace.
Example output for existing flash command:
$ devlink dev flash pci/0000:01:00.0 file firmware.bin
Preparing to flash
Flashing 100%
Flashing done
See this console recording which shows flashing FW on a Mellanox
Spectrum device:
https://asciinema.org/a/247926
Please see individual patches for changelog.
v2->v3 only adds tags and the last selftest patch
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Tue, 4 Jun 2019 13:40:43 +0000 (15:40 +0200)]
netdevsim: implement fake flash updating with notifications
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Tue, 4 Jun 2019 13:40:40 +0000 (15:40 +0200)]
devlink: allow driver to update progress of flash update
Introduce a function to be called from drivers during flash. It sends
notification to userspace about flash update progress.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Sun, 2 Jun 2019 21:16:01 +0000 (00:16 +0300)]
net: dsa: sja1105: Hide the dsa_8021q VLANs from the bridge fdb command
TX VLANs and RX VLANs are an internal implementation detail of DSA for
frame tagging. They work by installing special VLANs on switch ports in
the operating modes where no behavior change w.r.t. VLANs can be
observed by the user.
Therefore it makes sense to hide these VLANs in the 'bridge fdb'
command, as well as translate the pvid into the RX VID and TX VID on
'bridge fdb add' and 'bridge fdb del' commands.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Sun, 2 Jun 2019 21:15:45 +0000 (00:15 +0300)]
net: dsa: sja1105: Add FDB operations for P/Q/R/S series
This adds support for manipulating the L2 forwarding database (dump,
add, delete) for the second generation of NXP SJA1105 switches.
At the moment only FDB entries installed statically through 'bridge fdb'
are visible in the dump callback - the dynamically learned ones are
still under investigation.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Sun, 2 Jun 2019 21:11:58 +0000 (00:11 +0300)]
net: dsa: sja1105: Add P/Q/R/S support for dynamic L2 lookup operations
These are needed in order to implement the switchdev FDB callbacks.
Compared to the E/T generation, not only is the ABI (bit offsets)
different, but a HOSTCMD field has also been introduced, which permits
O(1) TCAM search for an FDB entry. Make use of the newly introduced
OP_SEARCH to permit that. It will be used while adding and deleting an
FDB entry (to see whether it exists or not).
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Sun, 2 Jun 2019 21:11:57 +0000 (00:11 +0300)]
net: dsa: sja1105: Make room for P/Q/R/S FDB operations
The DSA callbacks were written with the E/T (first generation) in mind,
which is quite different.
For P/Q/R/S completely new implementations need to be provided, which
are held as function pointers in the priv->info structure. We are
taking a slightly roundabout way for this (a function from
sja1105_main.c reads a structure defined in sja1105_spi.c that
points to a function defined in sja1105_main.c), but it is what it is.
The FDB dump callback works for both families, hence no function pointer
for that.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Sun, 2 Jun 2019 21:11:56 +0000 (00:11 +0300)]
net: dsa: sja1105: Plug in support for TCAM searches via the dynamic interface
Only a single dynamic configuration table of the SJA1105 P/Q/R/S
supports this operation: the FDB.
To keep the existing structure in place (sja1105_dynamic_config_read and
sja1105_dynamic_config_write) and not introduce any new function, a
convention is made for sja1105_dynamic_config_read that a negative index
argument denotes a search for the entry provided as argument.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
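The "negative index means search" convention described above can be modelled in a few lines. This is only a sketch: the table, the entry layout and the table_read() name are hypothetical, and the linear scan merely stands in for the hardware TCAM lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 8
#define SEARCH     (-1)	/* any negative index requests a search */

/* Hypothetical FDB entry: a MAC address mapped to a port. */
struct fdb_entry {
	uint64_t mac;
	int port;
	int valid;
};

static struct fdb_entry table[TABLE_SIZE];

/*
 * index >= 0: read the slot at that index into *entry.
 * index <  0: search for an entry whose mac equals entry->mac and fill
 *             in the rest of *entry on a hit.
 * Returns 0 on success, -1 if the index is out of range or nothing
 * valid was found.
 */
static int table_read(int index, struct fdb_entry *entry)
{
	int i;

	assert(entry != NULL);
	if (index >= TABLE_SIZE)
		return -1;
	if (index >= 0) {
		if (!table[index].valid)
			return -1;
		*entry = table[index];
		return 0;
	}
	for (i = 0; i < TABLE_SIZE; i++) {
		if (table[i].valid && table[i].mac == entry->mac) {
			*entry = table[i];
			return 0;
		}
	}
	return -1;
}
```

Overloading the index keeps a single read entry point, at the cost of callers having to know that the entry argument is an input as well as an output when searching.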
This appends to the L2 Forwarding and L2 Forwarding Parameters tables
(originally added for first-generation switches) the bits that are new
in the second generation.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Sun, 2 Jun 2019 21:11:54 +0000 (00:11 +0300)]
net: dsa: sja1105: Fix bit offsets of index field from L2 lookup entries
This was inadvertently copied from the SJA1105 E/T structure and not
tested. Cross-checking with the P/Q/R/S documentation (UM11040) makes
it immediately obvious what the correct bit offsets for this field are.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Sun, 2 Jun 2019 21:11:53 +0000 (00:11 +0300)]
net: dsa: sja1105: Shim declaration of struct sja1105_dyn_cmd
This structure is merely an implementation detail and should be hidden
from the sja1105_dynamic_config.h header, which provides to the rest of
the driver an abstract access to the dynamic configuration interface of
the switch.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Mon, 3 Jun 2019 19:25:43 +0000 (21:25 +0200)]
r8169: make rtl_fw_format_ok and rtl_fw_data_ok more independent
In preparation of factoring out the firmware handling code avoid any
usage of struct rtl8169_private internals. As part of it we can inline
rtl_check_firmware.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Mon, 3 Jun 2019 19:23:43 +0000 (21:23 +0200)]
r8169: add enum rtl_fw_opcode
Replace the firmware opcode defines with a proper enum. The BUG()
in rtl_fw_write_firmware() can be removed because the call to
rtl_fw_data_ok() ensures all opcodes are valid.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 3 Jun 2019 22:32:50 +0000 (15:32 -0700)]
Merge branch 'hns3-next'
Huazhong Tan says:
====================
code optimizations & bugfixes for HNS3 driver
This patch-set includes code optimizations and bugfixes for the HNS3
ethernet controller driver.
[patch 1/10] removes the redundant core reset type
[patch 2/10 - 3/10] fixes two VLAN related issues
[patch 4/10] fixes a TM issue
[patch 5/10 - 10/10] includes some patches related to RAS & MSI-X error
Change log:
V1->V2: removes two patches which need changes to HNS's infiniband
driver as well; they will be upstreamed later together with the
infiniband ones.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Weihang Li [Mon, 3 Jun 2019 02:09:22 +0000 (10:09 +0800)]
net: hns3: delay and separate enabling of NIC and ROCE HW errors
All RAS and MSI-X errors should be enabled only in the final stage of
HNS3 initialization. This means that they should be enabled in
hclge_init_xxx_client_instance instead of hclge_ae_dev(). This matters
especially for MSI-X: if it is enabled before opening the vector0 IRQ,
there is some chance that an MSI-X error will cause a failure during
initialization of the NIC client instance. So this patch delays the
enabling of HW errors.
In addition, we separate the enabling of ROCE RAS from NIC, because
it's not reasonable to enable ROCE RAS if we don't even have a ROCE
driver.
Signed-off-by: Weihang Li <liweihang@hisilicon.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Weihang Li [Mon, 3 Jun 2019 02:09:21 +0000 (10:09 +0800)]
net: hns3: add opcode about query and clear RAS & MSI-X to special opcode
There are four commands being used to query and clear RAS and MSI-X
interrupt status. They should be contained in the array of special
opcodes because these commands have several descriptors, and we need to
judge the return value in the first descriptor rather than the last one,
as is done for other opcodes. In addition, we shouldn't set the
NEXT_FLAG of the first descriptor.
This patch fixes the above issues.
Signed-off-by: Weihang Li <liweihang@hisilicon.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Weihang Li [Mon, 3 Jun 2019 02:09:20 +0000 (10:09 +0800)]
net: hns3: remove setting bit of reset_requests when handling mac tunnel interrupts
We shouldn't set the HNAE3_NONE_RESET bit of the variable that represents
a reset request during handling of MSI-X errors; otherwise it may cause
issues when a reset is triggered.
Signed-off-by: Weihang Li <liweihang@hisilicon.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Weihang Li [Mon, 3 Jun 2019 02:09:19 +0000 (10:09 +0800)]
net: hns3: add handling of two bits in MAC tunnel interrupts
LINK_UP and LINK_DOWN are two bits of MAC tunnel interrupts, but the
previous HNS3 driver didn't handle them. If they were enabled, the value
of these two bits would change during link down and link up, which would
cause the HNS3 driver to keep receiving IRQs it couldn't handle.
This patch adds handling of these two interrupt bits; we record
and clear them as we do for other MAC tunnel interrupts.
Signed-off-by: Weihang Li <liweihang@hisilicon.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Weihang Li [Mon, 3 Jun 2019 02:09:18 +0000 (10:09 +0800)]
net: hns3: set ops to null when unregister ad_dev
The hclge/hclgevf and hns3 modules can be unloaded independently.
When hclge/hclgevf is unloaded first, the ops of ae_dev should
be set to NULL, otherwise it will cause a use-after-free problem.
Fixes: 38caee9d3ee8 ("net: hns3: Add support of the HNAE3 framework")
Signed-off-by: Weihang Li <liweihang@hisilicon.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Weihang Li [Mon, 3 Jun 2019 02:09:17 +0000 (10:09 +0800)]
net: hns3: add a check to pointer in error_detected and slot_reset
If we add a VF without loading hclgevf.ko and then a RAS error
occurs, PCIe AER will call error_detected and slot_reset of all functions,
and will hit a NULL pointer when we check ae_dev->ops->handle_hw_ras_error.
This will cause a call trace and failures in the handling of follow-up RAS
errors.
This patch checks ae_dev and ae_dev->ops first to solve the above issues.
Signed-off-by: Weihang Li <liweihang@hisilicon.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yunsheng Lin [Mon, 3 Jun 2019 02:09:16 +0000 (10:09 +0800)]
net: hns3: set the port shaper according to MAC speed
This patch sets the port shaper according to the MAC speed as
suggested by hardware user manual.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jian Shen [Mon, 3 Jun 2019 02:09:15 +0000 (10:09 +0800)]
net: hns3: fix VLAN filter restore issue after reset
In the original code, the driver only restores VLAN filter entries
for the PF after reset, so the VLAN entries of VFs are lost in this
case.
This patch fixes it by recording VLAN IDs for each function
when adding a VLAN, and restoring the VLAN IDs after reset.
Fixes: 681ec3999b3d ("net: hns3: fix for vlan table lost problem when resetting")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jian Shen [Mon, 3 Jun 2019 02:09:14 +0000 (10:09 +0800)]
net: hns3: don't configure new VLAN ID into VF VLAN table when it's full
The VF VLAN table can support no more than 256 VLANs. When the user
adds too many VLANs, the VF VLAN table will be full, and firmware
will close the VF VLAN table for the function. When the VF VLAN table
is full and the user keeps adding new VLANs, it's unnecessary to
configure the VF VLAN table, because it will always fail and print a
warning message. The worst case is adding 4K VLANs and then doing a
reset: it will take a long time to restore these VLANs, which may cause
the VF reset to fail due to timeout.
Fixes: 6c251711b37f ("net: hns3: Disable vf vlan filter when vf vlan table is full")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
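The idea of not touching a table that is already known to be full can be sketched as below. The structure, the counter and vf_vlan_add() are hypothetical. Only the 256-entry limit comes from the commit message. The simulated command count shows how many hardware round trips the early return avoids.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define VF_VLAN_TABLE_MAX 256	/* limit stated in the commit message */

/* Hypothetical per-function bookkeeping. */
struct vf_vlan_state {
	unsigned int used;	/* entries currently programmed */
	bool full;		/* set once an add has failed for lack of room */
	unsigned int hw_cmds;	/* simulated hardware/firmware commands */
};

/* Returns true if the VLAN was programmed into the table. */
static bool vf_vlan_add(struct vf_vlan_state *s)
{
	assert(s != NULL);
	if (s->full)
		return false;	/* known to fail: skip the command */

	s->hw_cmds++;		/* simulated command to the device */
	if (s->used >= VF_VLAN_TABLE_MAX) {
		s->full = true;	/* remember the failure for next time */
		return false;
	}
	s->used++;
	return true;
}
```

In this model, adding 4096 VLANs issues 257 commands instead of 4096: 256 that succeed and one that discovers the table is full.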
Huazhong Tan [Mon, 3 Jun 2019 02:09:13 +0000 (10:09 +0800)]
net: hns3: remove redundant core reset
Since core reset is similar to the global reset, this
patch removes it and uses global reset to replace it.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 2 Jun 2019 18:24:18 +0000 (11:24 -0700)]
net: fix use-after-free in kfree_skb_list
syzbot reported a nasty use-after-free [1]
Let's remove the frag_list field from structs ip_fraglist_iter
and ip6_fraglist_iter. It seems not to be needed anyway.
[1] :
BUG: KASAN: use-after-free in kfree_skb_list+0x5d/0x60 net/core/skbuff.c:706
Read of size 8 at addr ffff888085a3cbc0 by task syz-executor303/8947
Memory state around the buggy address:
 ffff888085a3ca80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff888085a3cb00: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc
>ffff888085a3cb80: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
                                           ^
 ffff888085a3cc00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff888085a3cc80: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc
Fixes: 0feca6190f88 ("net: ipv6: add skbuff fraglist splitter")
Fixes: c8b17be0b7a4 ("net: ipv4: add skbuff fraglist splitter")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
YueHaibing [Sat, 1 Jun 2019 08:06:05 +0000 (16:06 +0800)]
qed: Fix build error without CONFIG_DEVLINK
Fix gcc build error while CONFIG_DEVLINK is not set
drivers/net/ethernet/qlogic/qed/qed_main.o: In function `qed_remove':
qed_main.c:(.text+0x1eb4): undefined reference to `devlink_unregister'
Select DEVLINK to fix this.
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 24e04879abdd ("qed: Add qed devlink parameters table")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 3 Jun 2019 22:00:00 +0000 (15:00 -0700)]
Merge branch 'Add-MT7629-ethernet-support'
Sean Wang says:
====================
Add MT7629 ethernet support
MT7629 includes two sets of SGMIIs used for an external switch or PHY, and an
embedded switch (ESW) via GDM1 and GePHY via GMAC2, so add several patches in
the series to make the code base common with the old SoCs.
Patches 1, 3 and 6 add an extension for SGMII to have the hardware configured
for 1G, 2.5G and AN to fit the capability of the target PHY. Patch 6 could
serve as an example showing how to use these configurations for the underlying
PHY speed to match up the link speed of the target PHY.
Patch 4 automatically configures the hardware path from GMACx to the target
PHY, using the description in the devicetree topology to determine the
proper value for the corresponding MUX.
Patches 2 and 5 are the update for MT7629, including the dt-binding document
and its driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sean Wang [Sat, 1 Jun 2019 00:03:13 +0000 (08:03 +0800)]
net: ethernet: mediatek: Integrate hardware path from GMAC to PHY variants
All path routing on the various SoCs is managed in the common function
mtk_setup_hw_path, which is determined at runtime both by the applied
devicetree regarding the path between GMAC and the target PHY or switch,
and by the capability of the target SoC.
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sean Wang [Sat, 1 Jun 2019 00:03:12 +0000 (08:03 +0800)]
net: ethernet: mediatek: Extend SGMII related functions
Add SGMII-related logic into a separate file, and also provide options for
forcing 1G, 2.5G or AN mode for the target PHY, which can be determined from
the SGMII node in DTS.
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 3 Jun 2019 20:42:56 +0000 (13:42 -0700)]
Merge tag 'mlx5-updates-2019-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2019-05-31
This series provides some updates to mlx5 core and netdevice driver.
1) use __netdev_tx_sent_queue() to improve performance under GSO workload
2) Allow matching only enc_key_id/enc_dst_port for decapsulation action
3) Geneve support:
This patchset adds support for GENEVE tunnel encap/decap flows offload:
encapsulating layer 2 Ethernet frames within layer 4 UDP datagrams.
The driver supports 6081 destination UDP port number, which is the
default IANA-assigned port.
Encap:
ConnectX-5 inserts the header (w/ or w/o Geneve TLV options) that is
provided by the mlx5 driver to the outgoing packet.
Decap:
Geneve header is matched and the packet is decapsulated.
Notes about decap flows with Geneve TLV Options:
- Support offloading of 32-bit options data only
- At any given time, only one combination of class/type parameters
can be offloaded, but the same class/type combination can have
many different flows offloaded with different 32-bit option data
- Options with value of 0 can't be offloaded
Managing Geneve TLV options:
Matching (on receive) is done by ConnectX-5 flex parser.
Geneve TLV options are managed using General Object of type
“Geneve TLV Options”.
When the first flow with certain class/type values is requested
to be offloaded, the driver creates a FW object with a FW command
(Geneve TLV Options general object) and starts counting the number
of flows using this object.
During this time, any request with different class/type values
will fail to be offloaded.
Once the refcount reaches 0, the driver destroys the TLV options
general object, and can now offload a flow with any class/type parameters.
Geneve TLV Options object is added to core device.
It is currently used to manage Geneve TLV options general
object allocation in FW and its reference counting only.
In the future it will also be used for managing geneve ports
by registering callbacks for ndo_udp_tunnel_add/del.
TC tunnel code refactoring:
As a preparation for Geneve code, the TC tunnel code in mlx5
was rearranged in a modular way, so that it would be easier
to add future tunnels:
- Defined tc tunnel object with the fields and callbacks that
any tunnel must implement.
- Define tc UDP tunnel object for UDP tunnels, such as VXLAN
- Move each tunnel code (GRE, VXLAN) to its own separate file
- Rewrite tc tunnel implementation in a general way – using
only the objects and their callbacks.
4) Termination tables:
Actions in tables set with the termination flag are guaranteed to terminate
the action list. Thus, potentially looping functionality (e.g. hairpin) can
safely be executed without potential loops.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
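The reference counting scheme that the mlx5 cover letter describes for the Geneve TLV options object can be sketched as follows. The struct and function names are hypothetical, and the comments mark where a driver would create or destroy the firmware object. No real mlx5 interface is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping for the single offloadable class/type pair. */
struct tlv_opt_obj {
	uint16_t opt_class;
	uint8_t  opt_type;
	unsigned int refcount;	/* 0 means no firmware object exists */
};

/* Returns 0 on success, -1 if a different class/type is already in use. */
static int tlv_opt_get(struct tlv_opt_obj *o, uint16_t opt_class,
		       uint8_t opt_type)
{
	assert(o != NULL);
	if (o->refcount == 0) {
		/* First flow: the firmware object would be created here. */
		o->opt_class = opt_class;
		o->opt_type = opt_type;
	} else if (o->opt_class != opt_class || o->opt_type != opt_type) {
		return -1;	/* only one combination at a time */
	}
	o->refcount++;
	return 0;
}

/* Drops one reference taken by tlv_opt_get(). */
static void tlv_opt_put(struct tlv_opt_obj *o)
{
	assert(o != NULL && o->refcount > 0);
	/* When this reaches 0 the firmware object would be destroyed. */
	o->refcount--;
}
```

Flows that share a class/type pair but differ in option data all take a reference on the same object, which matches the note that many flows with different 32-bit data can be offloaded at once.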
David S. Miller [Mon, 3 Jun 2019 20:30:38 +0000 (13:30 -0700)]
Merge branch 'ena-next'
Sameeh Jubran says:
====================
Extending the ena driver to support new features and enhance performance
This patchset introduces the following:
* add support for changing the inline header size (max_header_size) for applications
with overlay and nested headers
* enable automatic fallback to polling mode for admin queue when interrupt is not
available or missed
* add good checksum counter for Rx ethtool statistics
* update ena.txt
* some minor code clean-up
* some performance enhancements with doorbell calculations
Differences from V1:
* net: ena: add handling of llq max tx burst size (1/11):
* fixed christmas tree issue
* net: ena: ethtool: add extra properties retrieval via get_priv_flags (2/11):
* replaced snprintf with strlcpy
* dropped confusing error message
* added more details to the commit message
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sameeh Jubran [Mon, 3 Jun 2019 14:43:28 +0000 (17:43 +0300)]
net: ena: add good checksum counter
Add a new statistic to ethtool to indicate whether the device calculated
and validated the Rx csum.
Signed-off-by: Evgeny Shmeilin <evgeny@annapurnaLabs.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sameeh Jubran [Mon, 3 Jun 2019 14:43:27 +0000 (17:43 +0300)]
net: ena: optimise calculations for CQ doorbell
This patch initially checks if CQ doorbell
is needed before proceeding with the calculations.
Signed-off-by: Igor Chauskin <igorch@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sameeh Jubran [Mon, 3 Jun 2019 14:43:26 +0000 (17:43 +0300)]
net: ena: add support for changing max_header_size in LLQ mode
Up until now the driver always used a single setting for the sizes
of the different parts of the llq entry - 128 for entry size, 2 for
descriptors before header and 96 for maximum header size.
The current code makes sure that the parts of the llq entry are
compatible with each other and with the initial llq entry size given
by the device.
This commit changes this code to support any llq entry size.
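The default layout quoted above can be checked with simple arithmetic. The following is a minimal sketch, not the driver's code: it assumes each descriptor occupies 16 bytes (a value this message does not state), and the helper name llq_layout_fits is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed size of one TX descriptor in bytes; not stated in the commit message. */
#define DESC_SIZE_BYTES 16u

/*
 * Hypothetical helper: returns true when the descriptors placed before the
 * header plus the maximum inline header fit inside one llq entry.
 */
static bool llq_layout_fits(size_t entry_size, size_t descs_before_header,
                            size_t max_header_size)
{
    assert(entry_size > 0);
    return descs_before_header * DESC_SIZE_BYTES + max_header_size <= entry_size;
}
```

Under that assumption the old fixed setting is exactly full (2 * 16 + 96 = 128), so a larger maximum header size needs a larger entry size.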
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Sameeh Jubran <sameehj@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sameeh Jubran [Mon, 3 Jun 2019 14:43:25 +0000 (17:43 +0300)]
net: ena: allow automatic fallback to polling mode
Enable fallback to polling mode for the admin queue
when the driver identifies a command response that arrived
without an accompanying MSI-X interrupt.
Signed-off-by: Igor Chauskin <igorch@amazon.com> Signed-off-by: Sameeh Jubran <sameehj@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sameeh Jubran [Mon, 3 Jun 2019 14:43:23 +0000 (17:43 +0300)]
net: ena: add newline at the end of pr_err prints
Some pr_err prints lacked '\n' at the end. Added where missing.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Sameeh Jubran <sameehj@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sameeh Jubran [Mon, 3 Jun 2019 14:43:22 +0000 (17:43 +0300)]
net: ena: arrange ena_probe() function variables in reverse christmas tree
Reverse christmas tree arrangement is when declaration lines are written
from longer to shorter with each line. Most of our functions abide by this
arrangement but this function does not.
In this commit we arrange the variables of ena_probe() in reverse christmas
tree.
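As an illustration of the ordering only, here is a small sketch with hypothetical names (example_device, example_probe); it is not taken from ena_probe(). The local declarations run from the longest line to the shortest.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical device description used only for this illustration. */
struct example_device {
    const char *name;
    unsigned int queues;
};

/*
 * Hypothetical probe-like function. Stores the length of the device name in
 * *name_len_out and returns the queue count, or -1 on a bad argument.
 */
static int example_probe(const struct example_device *dev, size_t *name_len_out)
{
    /* Reverse christmas tree: longest declaration first, shortest last. */
    const char *fallback_name = "unnamed";
    unsigned int queue_count = 1;
    size_t name_len;
    int rc = 0;

    if (!name_len_out)
        return -1;
    if (dev && dev->name)
        fallback_name = dev->name;
    if (dev && dev->queues)
        queue_count = dev->queues;

    name_len = strlen(fallback_name);
    *name_len_out = name_len;
    rc = (int)queue_count;
    return rc;
}
```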
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Sameeh Jubran <sameehj@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sameeh Jubran [Mon, 3 Jun 2019 14:43:21 +0000 (17:43 +0300)]
net: ena: replace free_tx/rx_ids union with single free_ids field in ena_ring
struct ena_ring holds a union of free_rx_ids and free_tx_ids.
Both of the above fields mean the exact same thing and are used
exactly the same way.
Furthermore, these fields are always used with a prefix of the
type of ring. So for tx it will be tx_ring->free_tx_ids, and for
rx it will be rx_ring->free_rx_ids, which shows how redundant the
"_tx" and "_rx" parts are.
Furthermore still, this may lead to confusing code such as
tx_ring->free_rx_ids, which works correctly but looks like a mess.
This commit removes the aforementioned redundancy by replacing the
free_rx/tx_ids union with a single free_ids field.
It also changes a single goto label name from err_free_tx_ids: to
err_tx_free_ids: for consistency with the above new notation.
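The shape of the change can be sketched as follows. The struct names ring_before and ring_after are hypothetical, all other ring fields are omitted, and the uint16_t element type is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Before: two names for the same storage (element type is an assumption). */
struct ring_before {
    union {
        uint16_t *free_tx_ids;
        uint16_t *free_rx_ids;
    };
};

/* After: one field; the ring variable's name already says tx or rx. */
struct ring_after {
    uint16_t *free_ids;
};

/* Both members of the union were pointers of the same type, so nothing shrinks or grows. */
static_assert(sizeof(struct ring_before) == sizeof(struct ring_after),
              "replacing the union with one field keeps the size");
```

With the union, writing through one member and reading through the other yields the same pointer, which is why code like tx_ring->free_rx_ids worked despite looking wrong.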
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Sameeh Jubran <sameehj@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: ena: ethtool: add extra properties retrieval via get_priv_flags
This commit adds a mechanism for exposing different device
properties via ethtool's priv_flags. The strings are provided
by the device and copied to user space through the driver.
In this commit we:
* Add commands, structs and defines necessary for handling
extra properties.
* Add functions for:
  - Allocation/destruction of a buffer for extra properties strings.
  - Retrieval of extra properties strings and flags from the network device.
* Handle the allocation of a buffer for extra properties strings.
* Initialize the buffer with extra properties strings from the
network device at driver startup.
* Use ethtool's get_priv_flags to expose extra properties of
the ENA device.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Sameeh Jubran <sameehj@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sameeh Jubran [Mon, 3 Jun 2019 14:43:19 +0000 (17:43 +0300)]
net: ena: add handling of llq max tx burst size
There is a maximum TX burst size that the ENA device can handle.
It is exposed by the device to the driver and the driver
needs to comply with it to avoid bugs.
In this commit we:
1. Add ena_com_is_doorbell_needed(), which calculates the number of
llq entries that will be used to hold a packet, and will return
true if they exceed the number of allowed entries in a burst.
If the function returns true, a doorbell needs to be invoked
to send this packet in the next burst.
2. Follow the available entries in the current burst:
- A new burst begins with every doorbell.
- With each write of an llq entry, the available entries in the
current burst are decreased by 1.
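The accounting described above can be sketched as below. This is an illustration under assumptions, not the driver's implementation: the struct and function names are hypothetical, and the check compares against the entries still available in the current burst, which is one reading of the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-queue burst bookkeeping. */
struct burst_state {
    uint32_t max_entries_in_burst; /* limit exposed by the device */
    uint32_t entries_left;         /* entries still available in this burst */
};

/* True if the packet needs more llq entries than the current burst has left. */
static bool burst_doorbell_needed(const struct burst_state *s,
                                  uint32_t entries_for_packet)
{
    assert(s != NULL);
    return entries_for_packet > s->entries_left;
}

/* Each written llq entry consumes one slot of the current burst. */
static void burst_write_entry(struct burst_state *s)
{
    assert(s != NULL);
    if (s->entries_left > 0)
        s->entries_left--;
}

/* Ringing the doorbell starts a new burst with the full allowance. */
static void burst_ring_doorbell(struct burst_state *s)
{
    assert(s != NULL);
    s->entries_left = s->max_entries_in_burst;
}
```

A caller would check burst_doorbell_needed() before writing a packet's entries and ring the doorbell first when it returns true.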
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Sameeh Jubran <sameehj@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: dsa: mv88e6xxx: make mv88e6xxx_g1_stats_wait static
mv88e6xxx_g1_stats_wait has no users outside global1.c, so make it
static.
Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: dsa: mv88e6xxx: fix comments and macro names in mv88e6390_g1_mgmt_rsvd2cpu
The macros have an extraneous '800' (after 0180C2 there should be just
six nibbles, with X representing one), while the comments have
interchanged c2 and 80 and an extra :00.
Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
r8169: replace several function pointers with direct calls
This series removes most function pointers from struct rtl8169_private
and uses direct calls instead. This simplifies the code and avoids
the penalty of indirect calls in times of retpoline.
====================
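The general pattern behind such a conversion can be sketched as follows. All names here (struct chip, chip_a_irq_enable and so on) are hypothetical and do not come from r8169; the mask values are arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum chip_kind { CHIP_A, CHIP_B };

/* Hypothetical device state. */
struct chip {
    enum chip_kind kind;
    uint32_t irq_mask;
    void (*irq_enable)(struct chip *c); /* "before": per-device function pointer */
};

static void chip_a_irq_enable(struct chip *c) { c->irq_mask = 0x00ffu; }
static void chip_b_irq_enable(struct chip *c) { c->irq_mask = 0xffffu; }

/* Before: indirect call through the pointer stored in the struct. */
static void irq_enable_indirect(struct chip *c)
{
    assert(c != NULL && c->irq_enable != NULL);
    c->irq_enable(c);
}

/* After: branch on the chip kind and call the implementation directly. */
static void irq_enable_direct(struct chip *c)
{
    assert(c != NULL);
    if (c->kind == CHIP_A)
        chip_a_irq_enable(c);
    else
        chip_b_irq_enable(c);
}
```

The direct form trades a pointer field for a branch, which the compiler can resolve to ordinary direct calls instead of a retpoline-protected indirect call.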
Signed-off-by: David S. Miller <davem@davemloft.net>
pci_device_to_OF_node(to_pci_dev(dev)) is the same as dev->of_node,
so we can simplify the code. In addition add an empty line before
the return statement.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 3 Jun 2019 01:08:47 +0000 (18:08 -0700)]
Merge branch 'ifa_list-RCU'
Florian Westphal says:
====================
net: add rcu annotations for ifa_list
v3: fix typo in patch1 commit message
All other patches are unchanged.
v2: remove ifa_list iteration in afs instead of conversion
Eric Dumazet reported following problem:
It looks that unless RTNL is held, accessing ifa_list needs proper RCU
protection. indev->ifa_list can be changed under us by another cpu
(which owns RTNL) [..]
A proper rcu_dereference() with an happy sparse support would require
adding __rcu attribute.
This patch series does that: add __rcu to the ifa_list pointers.
That makes sparse complain, so the series also adds the required
rcu_assign_pointer/dereference helpers where needed.
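As a rough userspace analogy only, the publish/read pairing that these helpers enforce can be sketched with C11 atomics: a release store stands in for rcu_assign_pointer() and an acquire load stands in for rcu_dereference(). The struct and function names are hypothetical, and the sketch does not model grace periods or freeing of old entries.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for an address entry and its owning device. */
struct ifa_node {
    int addr;
    struct ifa_node *next;
};

struct dev_addrs {
    _Atomic(struct ifa_node *) ifa_list;
};

/* Writer side: fully initialise the node, then publish it with release order. */
static void publish_head(struct dev_addrs *d, struct ifa_node *n)
{
    assert(d != NULL && n != NULL);
    n->next = atomic_load_explicit(&d->ifa_list, memory_order_relaxed);
    atomic_store_explicit(&d->ifa_list, n, memory_order_release);
}

/* Reader side: load the head with acquire order before following it. */
static struct ifa_node *read_head(struct dev_addrs *d)
{
    assert(d != NULL);
    return atomic_load_explicit(&d->ifa_list, memory_order_acquire);
}
```

The point of the annotation is that a plain pointer access on either side would compile but would not carry these ordering guarantees, which is what sparse flags once the field is marked __rcu.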
All patches except the last one are preparation work.
Two new macros are introduced for in_ifaddr walks.
Last patch adds the __rcu annotations and the assign_pointer/dereference
helper calls.
This patch is a bit large, but I found no better way -- other
approaches (annotate-first or add helpers-first) all result in
mid-series sparse warnings.
This series is submitted vs. net-next rather than net for several
reasons:
1. It's (mostly) compile-tested only
2. 3rd patch changes behaviour wrt. secondary addresses
(see changelog)
3. The problem has existed for a very long time (2004), so it doesn't
seem to be urgent to fix this -- rcu use to free ifa_list
predates the git era.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Fri, 31 May 2019 16:27:05 +0000 (18:27 +0200)]
devinet: use in_dev_for_each_ifa_rcu in more places
This also replaces spots that used for_primary_ifa().
for_primary_ifa() aborts the loop on the first secondary address seen.
Replace it with either the rcu or rtnl variant of in_dev_for_each_ifa(),
but two places will now also consider secondary addresses:
inet_addr_onlink() and inet_ifa_byprefix().
I do not understand why they should ignore secondary addresses.
Why would a secondary address not be considered 'on link'?
When matching a prefix, why ignore a matching secondary address?
Other places get converted as well, but gain "->flags & SECONDARY" check.
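The difference between the two iteration styles can be sketched on a plain linked list. Everything here is a hypothetical stand-in: ADDR_F_SECONDARY for the kernel's secondary flag, addr_entry for the address structure, and the counting functions for real callers. Whether a converted caller skips or stops at a secondary address is not stated above; the sketch skips.

```c
#include <stddef.h>

#define ADDR_F_SECONDARY 0x01u /* stand-in for the kernel's secondary flag */

/* Hypothetical address list entry. */
struct addr_entry {
    unsigned int flags;
    struct addr_entry *next;
};

/* Old style: the walk ends at the first secondary address. */
static size_t count_until_first_secondary(const struct addr_entry *head)
{
    size_t n = 0;

    for (const struct addr_entry *e = head; e; e = e->next) {
        if (e->flags & ADDR_F_SECONDARY)
            break;
        n++;
    }
    return n;
}

/* New style, primaries only: walk everything, check the flag explicitly. */
static size_t count_primaries_full_walk(const struct addr_entry *head)
{
    size_t n = 0;

    for (const struct addr_entry *e = head; e; e = e->next) {
        if (e->flags & ADDR_F_SECONDARY)
            continue;
        n++;
    }
    return n;
}

/* New style without the check: secondary addresses are considered too. */
static size_t count_all(const struct addr_entry *head)
{
    size_t n = 0;

    for (const struct addr_entry *e = head; e; e = e->next)
        n++;
    return n;
}
```

On a list where primaries precede secondaries, the first two functions agree; only callers that drop the flag check, like the two named above, change behaviour.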
Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Fri, 31 May 2019 16:27:03 +0000 (18:27 +0200)]
afs: do not send list of client addresses
David Howells says:
I'm told that there's not really any point populating the list.
Current OpenAFS ignores it, as does AuriStor - and IBM AFS 3.6 will
do the right thing.
The list is actually useless as it's the client's view of the world,
not the servers, so if there's any NAT in the way its contents are
invalid. Further, it doesn't support IPv6 addresses.
On that basis, feel free to make it an empty list and remove all the
interface enumeration.
V1 of this patch reworked the function to use a new helper for the
ifa_list iteration to avoid sparse warnings once the proper __rcu
annotations get added in struct in_device later.
But, in light of the above, just remove afs_get_ipv4_interfaces.
Compile tested only.
Cc: David Howells <dhowells@redhat.com> Cc: linux-afs@lists.infradead.org Signed-off-by: Florian Westphal <fw@strlen.de> Tested-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Fri, 31 May 2019 13:27:38 +0000 (14:27 +0100)]
qed: remove redundant assignment to rc
The variable rc is assigned with a value that is never read and
it is re-assigned a new value later on. The assignment is redundant
and can be removed.
Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
When isdn4linux came up in the context of another patch series, I
remembered that we had discussed removing it a while ago.
It turns out that the suggestion from Karsten Keil was to remove I4L
in 2018 after the last public ISDN networks are shut down. This has
happened now (with a very small number of exceptions), so I guess it's
time to try again.
We currently have three ISDN stacks in the kernel: the original
isdn4linux (with the hisax driver), the newer CAPI (with four drivers),
and finally the mISDN stack (supporting roughly the same hardware as
hisax).
As far as I can tell, anyone using ISDN with mainline kernel drivers in
the past few years uses mISDN, and this is typically used for voice-only
PBX installations that don't require a public network.
The older stacks support additional features for data networks, but those
typically make no sense any more if there is no network to connect to.
My proposal for this time is to kill off isdn4linux entirely, as it seems
to have been unusable for quite a while. This code has been abandoned
for many years and it does cause problems for treewide maintenance as
it tends to do everything that we try to stop doing.
Birger Harzenetter mentioned that he is still using i4l in order to
make use of the 'divert' feature that is not part of mISDN, but has
otherwise moved on to mISDN for normal operation, like apparently
everyone else.
CAPI in turn is not quite as obsolete, but two of the drivers (avm
and hysdn) don't seem to be used at all, while another one (gigaset)
will stop being maintained as Paul Bolle is no longer able to
test it after the network gets shut down in September.
All three are now moved into drivers/staging to let others speak
up in case there are remaining users.
This leaves Bluetooth CMTP as the only remaining user of CAPI, but
Marcel Holtmann wishes to keep maintaining it.
For the discussion on version 1, see [2]
Unfortunately, Karsten Keil as the maintainer has not participated in
the discussion.