Eric Dumazet [Sat, 4 Dec 2021 04:53:56 +0000 (20:53 -0800)]
net: fix recent csum changes
Vladimir reported csum issues after my recent change in skb_postpull_rcsum()
Issue here is the following:
initial skb->csum is the csum of
[part to be pulled][rest of packet]
Old code:
skb->csum = csum_sub(skb->csum, csum_partial(pull, pull_length, 0));
New code:
skb->csum = ~csum_partial(pull, pull_length, ~skb->csum);
This is broken if the csum of [pulled part]
happens to be equal to skb->csum, because end
result of skb->csum is 0 in new code, instead
of being 0xffffffff
Two first patches add a generic infrastructure, that will be used
to get tracking of refcount increments/decrements.
The general idea is to be able to precisely pair each decrement with
a corresponding prior increment. Both share a cookie, basically
a pointer to private data storing stack traces.
The third patch adds dev_hold_track() and dev_put_track() helpers
(CONFIG_NET_DEV_REFCNT_TRACKER)
Then a series of 20 patches converts some dev_hold()/dev_put()
pairs to new hepers : dev_hold_track() and dev_put_track().
Hopefully this will be used by developpers and syzbot to
root cause bugs that cause netdevice dismantles freezes.
With CONFIG_PCPU_DEV_REFCNT=n option, we were able to detect
some class of bugs, but too late (when too many dev_put()
were happening).
Another series will be sent after this one is merged.
v3: moved NET_DEV_REFCNT_TRACKER to net/Kconfig.debug
added "depends on DEBUG_KERNEL && STACKTRACE_SUPPORT"
to hopefully get rid of kbuild reports for ARCH=nios2
Reworded patch 3 changelog.
Added missing htmldocs (Jakub)
v2: added four additional patches,
added netdev_tracker_alloc() and netdev_tracker_free()
addressed build error (kernel bots),
use GFP_ATOMIC in test_ref_tracker_timer_func()
====================
Eric Dumazet [Sun, 5 Dec 2021 04:21:57 +0000 (20:21 -0800)]
net: add net device refcount tracker infrastructure
net device are refcounted. Over the years we had numerous bugs
caused by imbalanced dev_hold() and dev_put() calls.
The general idea is to be able to precisely pair each decrement with
a corresponding prior increment. Both share a cookie, basically
a pointer to private data storing stack traces.
This patch adds dev_hold_track() and dev_put_track().
To use these helpers, each data structure owning a refcount
should also use a "netdevice_tracker" to pair the hold and put.
It can be hard to track where references are taken and released.
In networking, we have annoying issues at device or netns dismantles,
and we had various proposals to ease root causing them.
This patch adds new infrastructure pairing refcount increases
and decreases. This will self document code, because programmers
will have to associate increments/decrements.
This is controled by CONFIG_REF_TRACKER which can be selected
by users of this feature.
This adds both cpu and memory costs, and thus should probably be
used with care.
Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Manish Chopra [Thu, 2 Dec 2021 21:01:57 +0000 (13:01 -0800)]
qed*: esl priv flag support through ethtool
ESL(Enhanced System Lockdown) was designed to lock PCI adapter firmware
images and prevent changes to critical non-volatile configuration data
so that uncontrolled, malicious or unintentional modification to the
adapters are avoided, ensuring it's operational state. Once this feature is
enabled, the device is locked, rejecting any modification to non-volatile
images. Once unlocked, the protection is off such that firmware and
non-volatile configurations may be altered.
Driver just reflects the capability and status of this through
the ethtool private flag.
Manish Chopra [Thu, 2 Dec 2021 21:01:56 +0000 (13:01 -0800)]
qed*: enhance tx timeout debug info
This patch add some new qed APIs to query status block
info and report various data to MFW on tx timeout event
Along with that it enhances qede to dump more debug logs
(not just specific to the queue which was reported by stack)
on tx timeout which includes various other basic metadata about
all tx queues and other info (like status block etc.)
Dan Carpenter [Fri, 3 Dec 2021 09:55:31 +0000 (12:55 +0300)]
net: lan966x: fix a IS_ERR() vs NULL check in lan966x_create_targets()
The devm_ioremap() function does not return error pointers. It returns
NULL.
Fixes: db8bcaad5393 ("net: lan966x: add the basic lan966x driver") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Fri, 3 Dec 2021 07:04:18 +0000 (15:04 +0800)]
net: prestera: acl: fix return value check in prestera_acl_rule_entry_find()
rhashtable_lookup_fast() returns NULL pointer not ERR_PTR().
Return rhashtable_lookup_fast() directly to fix this.
Fixes: 47327e198d42 ("net: prestera: acl: migrate to new vTCAM api") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Shevchenko [Thu, 2 Dec 2021 21:00:29 +0000 (23:00 +0200)]
net: dsa: vsc73xxx: Get rid of duplicate of_node assignment
GPIO library does copy the of_node from the parent device of
the GPIO chip, there is no need to repeat this in the individual
drivers. Remove assignment here.
For the details one may look into the of_gpio_dev_init() implementation.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Chris Mi [Wed, 1 Dec 2021 13:31:53 +0000 (15:31 +0200)]
net/sched: act_ct: Offload only ASSURED connections
Short-lived connections increase the insertion rate requirements,
fill the offload table and provide very limited offload value since
they process a very small amount of packets. The ct ASSURED flag is
designed to filter short-lived connections for early expiration.
Offload connections when they are ESTABLISHED and ASSURED.
Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jie Wang [Fri, 3 Dec 2021 09:20:59 +0000 (17:20 +0800)]
net: hns3: fix hns3 driver header file not self-contained issue
The hns3 driver header file uses the structure of other files, but does
not include corresponding file, which causes a check warning that the
header file is not self-contained.
Therefore, the required header file is included in the header file, and
the structure declaration is added to the header file to avoid cyclic
dependency of the header file.
Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hao Chen [Fri, 3 Dec 2021 09:20:58 +0000 (17:20 +0800)]
net: hns3: replace one tab with space in for statement
Replace one tab with space between symbol ')' and '{' in for statement of
function hclge_map_tqp().
Signed-off-by: Hao Chen <chenhao288@hisilicon.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hao Chen [Fri, 3 Dec 2021 09:20:57 +0000 (17:20 +0800)]
net: hns3: remove rebundant line for hclge_dbg_dump_tm_pg()
Return value judgment should follow the function call, so remove line
between them.
Signed-off-by: Hao Chen <chenhao288@hisilicon.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hao Chen [Fri, 3 Dec 2021 09:20:56 +0000 (17:20 +0800)]
net: hns3: add comments for hclge_dbg_fill_content()
When we use hclge_dbg_fill_content() to fill contents with
specific format according to struct hclge_dbg_item *items,
it may cause content cover due to unreasonable items.
So add comments to explain how to avoid it.
Signed-off-by: Hao Chen <chenhao288@hisilicon.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hao Chen [Fri, 3 Dec 2021 09:20:55 +0000 (17:20 +0800)]
net: hns3: add void before function which don't receive ret
Add void before function which don't receive ret to improve code
readability.
Signed-off-by: Hao Chen <chenhao288@hisilicon.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hao Chen [Fri, 3 Dec 2021 09:20:54 +0000 (17:20 +0800)]
net: hns3: align return value type of atomic_read() with its output
Change output value type of atomic_read() from %u to %d.
Signed-off-by: Hao Chen <chenhao288@hisilicon.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hao Chen [Fri, 3 Dec 2021 09:20:52 +0000 (17:20 +0800)]
net: hns3: Align type of some variables with their print type
The c language has a set of implicit type conversions, when
two variables perform bitwise or arithmetic operations.
For example, variable A (type u16/u8) -1, its output is int type variable.
u16/u8 will convert to int type implicitly before it does arithmetic
operations. So, change 1 to unsigned type.
Signed-off-by: Hao Chen <chenhao288@hisilicon.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Guangbin Huang [Fri, 3 Dec 2021 09:20:50 +0000 (17:20 +0800)]
net: hns3: refactor function hclge_set_vlan_filter_hw
Function hclge_set_vlan_filter_hw() is a bit too long, so add a new
function hclge_need_update_port_vlan() to simplify code and improve
code readability.
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yufeng Mo [Fri, 3 Dec 2021 09:20:49 +0000 (17:20 +0800)]
net: hns3: optimize function hclge_cfg_common_loopback()
hclge_cfg_common_loopback() is a bit too long, so
encapsulate hclge_cfg_common_loopback_cmd_send() and
hclge_cfg_common_loopback_wait() two functions to
improve readability.
Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 3 Dec 2021 03:00:17 +0000 (19:00 -0800)]
Merge tag 'mlx5-updates-2021-12-02' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2021-12-02
Misc updates to mlx5 driver
1) Various code cleanups
2) Error path handling fixes of latest features
3) Print more information on pci error handling
4) Dynamically resize flow counters query buffer
====================
The flow counters bulk query buffer is allocated once during
mlx5_fc_init_stats(). For PFs and VFs this buffer usually takes a little
more than 512KB of memory, which is aligned to the next power of 2, to
1MB. For SFs, this buffer is reduced and takes around 128 Bytes.
The buffer size determines the maximum number of flow counters that
can be queried at a time. Thus, having a bigger buffer can improve
performance for users that need to query many flow counters.
There are cases that don't use many flow counters and don't need a big
buffer (e.g. SFs, VFs). Since this size is critical with large scale,
in these cases the buffer size should be reduced.
In order to reduce memory consumption while maintaining query
performance, change the query buffer's allocation scheme to the
following:
- First allocate the buffer with small initial size.
- If the number of counters surpasses the initial size, resize the
buffer to the maximum size.
The buffer only grows and isn't shrank, because users with many flow
counters don't care about the buffer size and we don't want to add
resize overhead if the current number of counters drops.
This solution is preferable to the current one, which is less accurate
and only addresses SFs.
Roi Dayan [Tue, 9 Nov 2021 12:07:49 +0000 (14:07 +0200)]
net/mlx5e: TC, Set flow attr ip_version earlier
Setting flow attr ip_version is not related to parsing tc flow actions.
It needs to be set after parsing flower matches which changes the spec.
So move it outside parse_tc_fdb_actions() and set it in
__mlx5e_add_fdb_flow().
Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
net/mlx5e: Hide function mlx5e_num_channels_changed
No calls for mlx5e_num_channels_changed() out of en_main.c,
turn it static and remove from header.
Keep the wrapper function mlx5e_num_channels_changed_ctx exposed.
Dan Carpenter [Sat, 27 Nov 2021 14:19:53 +0000 (17:19 +0300)]
net/mlx5: SF, silence an uninitialized variable warning
This code sometimes calls mlx5_sf_hw_table_hwc_init() when "ext_base_id"
is uninitialized. It's not used on that path, but it generates a static
checker warning to pass uninitialized variables to another function.
It may also generate runtime UBSan warnings depending on if the
mlx5_sf_hw_table_hwc_init() function is inlined or not.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Arnd Bergmann [Mon, 8 Nov 2021 11:10:33 +0000 (12:10 +0100)]
mlx5: fix mlx5i_grp_sw_update_stats() stack usage
The mlx5e_sw_stats structure has grown to the point of triggering
a warning when put on the stack of a function:
mlx5/core/ipoib/ipoib.c: In function 'mlx5i_grp_sw_update_stats':
mlx5/core/ipoib/ipoib.c:136:1: error: the frame size of 1028 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
In this case, only five of the structure members are actually set,
so it's sufficient to have those as separate local variables.
As en_rep.c uses 'struct rtnl_link_stats64' for this, just use
the same one here for consistency.
Arnd Bergmann [Mon, 8 Nov 2021 11:10:32 +0000 (12:10 +0100)]
mlx5: fix psample_sample_packet link error
When PSAMPLE is a loadable module, built-in drivers cannot use it:
aarch64-linux-ld: drivers/net/ethernet/mellanox/mlx5/core/en/tc/sample.o: in function `mlx5e_tc_sample_skb':
sample.c:(.text+0xd68): undefined reference to `psample_sample_packet'
Add the same dependency here that is used for MLXSW
- rds: correct socket tunable error in rds_tcp_tune()
- ipv6: fix memory leak in fib6_rule_suppress
- wireguard: reset peer src endpoint when netns exits
- wireguard: improve resilience to DoS around incoming handshakes
- tcp: fix page frag corruption on page fault which involves TCP
- mpls: fix missing attributes in delete notifications
- mt7915: fix NULL pointer dereference with ad-hoc mode
Misc:
- rt2x00: be more lenient about EPROTO errors during start
- mlx4_en: update reported link modes for 1/10G"
* tag 'net-5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (85 commits)
net: dsa: b53: Add SPI ID table
gro: Fix inconsistent indenting
selftests: net: Correct case name
net/rds: correct socket tunable error in rds_tcp_tune()
mctp: Don't let RTM_DELROUTE delete local routes
net/smc: Keep smc_close_final rc during active close
ibmvnic: drop bad optimization in reuse_tx_pools()
ibmvnic: drop bad optimization in reuse_rx_pools()
net/smc: fix wrong list_del in smc_lgr_cleanup_early
Fix Comment of ETH_P_802_3_MIN
ethernet: aquantia: Try MAC address from device tree
ipv4: convert fib_num_tclassid_users to atomic_t
net: avoid uninit-value from tcp_conn_request
net: annotate data-races on txq->xmit_lock_owner
octeontx2-af: Fix a memleak bug in rvu_mbox_init()
net/mlx4_en: Fix an use-after-free bug in mlx4_en_try_alloc_resources()
vrf: Reset IPCB/IP6CB when processing outbound pkts in vrf dev xmit
net: qlogic: qlcnic: Fix a NULL pointer dereference in qlcnic_83xx_add_rings()
net: dsa: mv88e6xxx: Link in pcs_get_state() if AN is bypassed
net: dsa: mv88e6xxx: Fix inband AN for 2500base-x on 88E6393X family
...
Linus Torvalds [Thu, 2 Dec 2021 19:07:41 +0000 (11:07 -0800)]
Merge tag 'trace-v5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"Three tracing fixes:
- Allow compares of strings when using signed and unsigned characters
- Fix kmemleak false positive for histogram entries
- Handle negative numbers for user defined kretprobe data sizes"
* tag 'trace-v5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
kprobes: Limit max data_size of the kretprobe instances
tracing: Fix a kmemleak false positive in tracing_map
tracing/histograms: String compares should not care about signed values
Linus Torvalds [Thu, 2 Dec 2021 18:56:16 +0000 (10:56 -0800)]
Merge tag 'for-linus-5.16-2' of git://github.com/cminyard/linux-ipmi
Pull IPMI fixes from Corey Minyard:
"Some changes that went in 5.16 had issues. When working on the design
a piece was redesigned and things got missed. And the message type was
not being initialized when it was allocated, resulting in crashes.
In addition, the IPMI driver has had a shutdown issue where it could
still have an item in a system workqueue after it had been shutdown.
Move to a private workqueue to avoid that problem"
* tag 'for-linus-5.16-2' of git://github.com/cminyard/linux-ipmi:
ipmi:ipmb: Fix unknown command response
ipmi: fix IPMI_SMI_MSG_TYPE_IPMB_DIRECT response length checking
ipmi: fix oob access due to uninit smi_msg type
ipmi: msghandler: Make symbol 'remove_work_wq' static
ipmi: Move remove_work to dedicated workqueue
Currently autoloading for SPI devices does not use the DT ID table, it
uses SPI modalises. Supporting OF modalises is going to be difficult if
not impractical, an attempt was made but has been reverted, so ensure
that module autoloading works for this driver by adding an id_table
listing the SPI IDs for everything.
Fixes: 96c8395e2166 ("spi: Revert modalias changes") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Horatiu Vultur [Thu, 2 Dec 2021 08:15:11 +0000 (09:15 +0100)]
net: lan966x: Fix builds for lan966x driver
The lan966x is using the function 'packing' to create/extract the
information for the IFH, that is used to be added in front of the frames
when they are injected/extracted.
Therefore update the Kconfig to select config option 'PACKING' whenever
lan966x driver is enabled.
Fixes: db8bcaad539314 ("net: lan966x: add the basic lan966x driver") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Horatiu Vultur [Thu, 2 Dec 2021 11:05:04 +0000 (12:05 +0100)]
dt-bindings: net: lan966x: Add additional properties for lan966x
This patch updates the dt-bindings for lan966x switch.
It adds the properties 'additionalProperties' and
'unevaluatedProperties' for ethernet-ports and ports nodes. In this way
it is not possible to add more properties to these nodes.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
qed: Enhance rammod debug prints to provide pretty details
Instead of printing numbers of protocol IDs and rammod commands,
enhance debug prints to provide names. s_protocol_types[] and
s_ramrod_cmd_ids arrays[] are added to support along with APIs.
Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: Alok Prasad <palok@marvell.com> Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Li Zhijian [Thu, 2 Dec 2021 02:28:41 +0000 (10:28 +0800)]
selftests: net: Correct case name
ipv6_addr_bind/ipv4_addr_bind are function names. Previously, bind test
would not be run by default due to the wrong case names
Fixes: 34d0302ab861 ("selftests: Add ipv6 address bind tests to fcnal-test") Fixes: 75b2b2b3db4c ("selftests: Add ipv4 address bind tests to fcnal-test") Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Horatiu Vultur [Wed, 1 Dec 2021 14:53:51 +0000 (15:53 +0100)]
net: lan966x: Fix duplicate check in frame extraction
The blamed commit generates the following smatch static checker warning:
drivers/net/ethernet/microchip/lan966x/lan966x_main.c:515 lan966x_xtr_irq_handler()
warn: duplicate check 'sz < 0' (previous on line 502)
This patch fixes this issue removing the duplicate check 'sz < 0'
Fixes: d28d6d2e37d10d ("net: lan966x: add port module support") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net/rds: correct socket tunable error in rds_tcp_tune()
Correct an error where setting /proc/sys/net/rds/tcp/rds_tcp_rcvbuf would
instead modify the socket's sk_sndbuf and would leave sk_rcvbuf untouched.
Fixes: c6a58ffed536 ("RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket") Signed-off-by: William Kucharski <william.kucharski@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Matt Johnston [Wed, 1 Dec 2021 08:07:42 +0000 (16:07 +0800)]
mctp: Don't let RTM_DELROUTE delete local routes
We need to test against the existing route type, not
the rtm_type in the netlink request.
Fixes: 83f0a0b7285b ("mctp: Specify route types, require rtm_type in RTM_*ROUTE messages") Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Lu [Wed, 1 Dec 2021 06:42:16 +0000 (14:42 +0800)]
net/smc: Keep smc_close_final rc during active close
When smc_close_final() returns error, the return code overwrites by
kernel_sock_shutdown() in smc_close_active(). The return code of
smc_close_final() is more important than kernel_sock_shutdown(), and it
will pass to userspace directly.
Fix it by keeping both return codes, if smc_close_final() raises an
error, return it or kernel_sock_shutdown()'s.
Link: https://lore.kernel.org/linux-s390/1f67548e-cbf6-0dce-82b5-10288a4583bd@linux.ibm.com/ Fixes: 606a63c9783a ("net/smc: Ensure the active closing peer first closes clcsock") Suggested-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Tony Lu <tonylu@linux.alibaba.com> Reviewed-by: Wen Gu <guwen@linux.alibaba.com> Acked-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
ibmvnic: drop bad optimization in reuse_tx_pools()
When trying to decide whether or not reuse existing rx/tx pools
we tried to allow a range of values for the pool parameters rather
than exact matches. This was intended to reuse the resources for
instance when switching between two VIO servers with different
default parameters.
But this optimization is incomplete and breaks when we try to
change the number of queues for instance. The optimization needs
to be updated, so drop it for now and simplify the code.
Fixes: bbd809305bc7 ("ibmvnic: Reuse tx pools when possible") Reported-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Reviewed-by: Rick Lindsley <ricklind@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
ibmvnic: drop bad optimization in reuse_rx_pools()
When trying to decide whether or not reuse existing rx/tx pools
we tried to allow a range of values for the pool parameters rather
than exact matches. This was intended to reuse the resources for
instance when switching between two VIO servers with different
default parameters.
But this optimization is incomplete and breaks when we try to
change the number of queues for instance. The optimization needs
to be updated, so drop it for now and simplify the code.
Fixes: 489de956e7a2 ("ibmvnic: Reuse rx pools when possible") Reported-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Reviewed-by: Rick Lindsley <ricklind@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dust Li [Wed, 1 Dec 2021 03:02:30 +0000 (11:02 +0800)]
net/smc: fix wrong list_del in smc_lgr_cleanup_early
smc_lgr_cleanup_early() meant to delete the link
group from the link group list, but it deleted
the list head by mistake.
This may cause memory corruption since we didn't
remove the real link group from the list and later
memseted the link group structure.
We got a list corruption panic when testing:
Fixes: a0a62ee15a829 ("net/smc: separate locks for SMCD and SMCR link group lists") Signed-off-by: Dust Li <dust.li@linux.alibaba.com> Acked-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xiayu Zhang [Wed, 1 Dec 2021 02:57:13 +0000 (10:57 +0800)]
Fix Comment of ETH_P_802_3_MIN
The description of ETH_P_802_3_MIN is misleading.
The value of EthernetType in Ethernet II frame is more than 0x0600,
the value of Length in 802.3 frame is less than 0x0600.
Signed-off-by: Xiayu Zhang <Xiayu.Zhang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tianhao Chai [Wed, 1 Dec 2021 02:57:06 +0000 (20:57 -0600)]
ethernet: aquantia: Try MAC address from device tree
Apple M1 Mac minis (2020) with 10GE NICs do not have MAC address in the
card, but instead need to obtain MAC addresses from the device tree. In
this case the hardware will report an invalid MAC.
Currently atlantic driver does not query the DT for MAC address and will
randomly assign a MAC if the NIC doesn't have a permanent MAC burnt in.
This patch causes the driver to perfer a valid MAC address from OF (if
present) over HW self-reported MAC and only fall back to a random MAC
address when neither of them is valid.
Signed-off-by: Tianhao Chai <cth451@gmail.com> Reviewed-by: Igor Russkikh <irusskikh@marvell.com> Reviewed-by: Hector Martin <marcan@marcan.st> Signed-off-by: David S. Miller <davem@davemloft.net>
Ansuel Smith [Tue, 30 Nov 2021 21:16:25 +0000 (22:16 +0100)]
dt-bindings: net: dsa: qca8k: improve port definition documentation
Clean and improve port definition for qca8k documentation by referencing
the dsa generic port definition and adding the additional specific port
definition.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 2 Dec 2021 02:26:35 +0000 (18:26 -0800)]
ipv4: convert fib_num_tclassid_users to atomic_t
Before commit faa041a40b9f ("ipv4: Create cleanup helper for fib_nh")
changes to net->ipv4.fib_num_tclassid_users were protected by RTNL.
After the change, this is no longer the case, as free_fib_info_rcu()
runs after rcu grace period, without rtnl being held.
Fixes: faa041a40b9f ("ipv4: Create cleanup helper for fib_nh") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Ahern <dsahern@kernel.org> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Jie Wang [Thu, 2 Dec 2021 08:36:03 +0000 (16:36 +0800)]
net: hns3: refactor function hns3_get_vector_ring_chain()
Currently hns3_get_vector_ring_chain() is a bit long. Refactor it by
extracting sub process to improve the readability.
Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jie Wang [Thu, 2 Dec 2021 08:36:02 +0000 (16:36 +0800)]
net: hns3: refactor function hclge_set_channels()
Currently hclge_set_channels() is a bit long. Refactor it by extracting
sub process to improve the readability.
Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jie Wang [Thu, 2 Dec 2021 08:36:01 +0000 (16:36 +0800)]
net: hns3: refactor function hclge_configure()
Currently hclge_configure() is a bit long. Refactor it by extracting sub
process to improve the readability.
Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jian Shen [Thu, 2 Dec 2021 08:36:00 +0000 (16:36 +0800)]
net: hns3: split function hclge_update_port_base_vlan_cfg()
Currently the function hclge_update_port_base_vlan_cfg() is a
bit long. Split it to several small functions, to improve the
readability.
Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yufeng Mo [Thu, 2 Dec 2021 08:35:59 +0000 (16:35 +0800)]
net: hns3: split function hns3_nic_net_xmit()
Function hns3_nic_net_xmit() is a bit too long. So add a
new function hns3_handle_skb_desc() to simplify code and improve
code readability.
Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jian Shen [Thu, 2 Dec 2021 08:35:58 +0000 (16:35 +0800)]
net: hns3: split function hclge_get_fd_rule_info()
Currently the function hclge_get_fd_rule_info() is a bit long.
Split it to several small functions, to improve readability.
Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jian Shen [Thu, 2 Dec 2021 08:35:57 +0000 (16:35 +0800)]
net: hns3: split function hclge_init_vlan_config()
Currently the function hclge_init_vlan_config() is a bit long.
Split it to several small functions, to simplify code and
improve code readability.
Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Thu, 2 Dec 2021 08:35:56 +0000 (16:35 +0800)]
net: hns3: refactor function hns3_fill_skb_desc to simplify code
The function hns3_fill_skb_desc is hard to read, this patch
extract 2 functions and add new a struct data to simplify the
code and Improve readability.
Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Thu, 2 Dec 2021 08:35:55 +0000 (16:35 +0800)]
net: hns3: extract macro to simplify ring stats update code
As the code to update ring stats is alike for different ring stats
type, this patch extract macro to simplify ring stats update code.
Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 28712 Comm: syz-executor.0 Tainted: G W 5.16.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Zhou Qingyang [Tue, 30 Nov 2021 16:50:39 +0000 (00:50 +0800)]
octeontx2-af: Fix a memleak bug in rvu_mbox_init()
In rvu_mbox_init(), mbox_regions is not freed or passed out
under the switch-default region, which could lead to a memory leak.
Fix this bug by changing 'return err' to 'goto free_regions'.
This bug was found by a static analyzer. The analysis employs
differential checking to identify inconsistent security operations
(e.g., checks or kfrees) between two code paths and confirms that the
inconsistent operations are not recovered in the current function or
the callers, so they constitute bugs.
Note that, as a bug found by static analysis, it can be a false
positive or hard to trigger. Multiple researchers have cross-reviewed
the bug.
Builds with CONFIG_OCTEONTX2_AF=y show no new warnings,
and our static analyzer no longer warns about this code.
The new SNMP variable (TCPSmallQueueFailure) can be incremented
for good reasons, even on a 100Gbit single TCP_STREAM flow.
If we really wanted to ease driver debugging [1], this would
require something more sophisticated.
[1] Usually, if a driver is delaying TX completions too much,
this can lead to stalls in TCP output. Various work arounds
have been used in the past, like skb_orphan() in ndo_start_xmit().
Zhou Qingyang [Tue, 30 Nov 2021 16:44:38 +0000 (00:44 +0800)]
net/mlx4_en: Fix an use-after-free bug in mlx4_en_try_alloc_resources()
In mlx4_en_try_alloc_resources(), mlx4_en_copy_priv() is called and
tmp->tx_cq will be freed on the error path of mlx4_en_copy_priv().
After that mlx4_en_alloc_resources() is called and there is a dereference
of &tmp->tx_cq[t][i] in mlx4_en_alloc_resources(), which could lead to
a use after free problem on failure of mlx4_en_copy_priv().
Fix this bug by adding a check of mlx4_en_copy_priv()
This bug was found by a static analyzer. The analysis employs
differential checking to identify inconsistent security operations
(e.g., checks or kfrees) between two code paths and confirms that the
inconsistent operations are not recovered in the current function or
the callers, so they constitute bugs.
Note that, as a bug found by static analysis, it can be a false
positive or hard to trigger. Multiple researchers have cross-reviewed
the bug.
Builds with CONFIG_MLX4_EN=m show no new warnings,
and our static analyzer no longer warns about this code.
Fixes: ec25bc04ed8e ("net/mlx4_en: Add resilience in low memory systems") Signed-off-by: Zhou Qingyang <zhou1615@umn.edu> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20211130164438.190591-1-zhou1615@umn.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org>